Metadata as Linked Data for Research Data Repositories


Field Value
rdf:type r4r:Article
  • 台北, 台灣
  • Taipei, Taiwan
  • 李承錱, 黃韋菁, 莊庭瑞
  • Cheng-Jen Lee, Andrea Wei-Ching Huang and Tyng-Ruey Chuang
dc:date 2017-03-08
dc:description Data repositories have long been an effective mechanism for integrating and managing research resources, and improving the reuse value of those resources is one of the key issues for data repositories. Recently, publishing such resources on the World Wide Web as linked data, with referenced interlinks and rich semantics, has drawn much attention from data repository owners. In this study, we first converted 840,000 XML-formatted, CC-licensed digital resources from the Union Catalog of Digital Archives Taiwan into human-editable CSV tabular data. The tabular data were then transformed into linked data in RDF (Resource Description Framework). This XML-CSV-RDF process keeps the resources accessible to both humans and machines. An ontology (voc4odw) was also designed to describe each resource and its provenance. Two types of linked data resources are generated: one type is described simply by the 15 Dublin Core elements, and the other type is enriched with domain vocabularies from external datasets including Wikidata, GeoNames, and Encyclopedia of Life. For the second type, several “versions” of a resource may exist, each using different domain vocabularies to provide more insight into the data. To demonstrate the generated linked data resources, we built a linked open data (LOD) repository using CKAN (Comprehensive Knowledge Archive Network). CKAN is an open source data management system with comprehensive functions for publishing, storing, managing, showing, and using data, including both raw data and metadata. Resources in RDF format were loaded into our CKAN instance through a custom harvesting method, which also allows each resource to be exported in linked data formats such as RDF/XML, Turtle, and JSON-LD.
Thanks to CKAN’s high flexibility in metadata customization and data preview methods, we extended CKAN so that people can explore all linked data resources in well-formed table views and find resources by keyword, facet, time range, or spatial extent. A SPARQL endpoint provided by OpenLink Virtuoso is also integrated into CKAN’s interface for advanced usage. Ongoing work includes linking to more external datasets, improving the import process, and aggregating existing resources to infer and obtain “new knowledge” with CKAN’s data representation capabilities.
dc:format PDF
dc:identifier DOI: 10.13140/RG.2.2.30887.34724
dc:language English
dc:publisher International Symposium on Grids and Clouds (ISGC) 2017
dc:rights CC BY 4.0
dc:subject CKAN, LOD, semantics, archive, catalog, metadata, research repository
dc:title Metadata as Linked Data for Research Data Repositories
dc:type Conference Poster
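
The CSV-to-RDF step mentioned in dc:description can be illustrated with a minimal sketch. It is not the authors' conversion tool: the base URI, the `id` column, and the sample row are hypothetical, and the sketch only covers the simpler of the two resource types (plain Dublin Core elements), writing each cell as one N-Triples statement.

```python
import csv
import io
from urllib.parse import quote

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core Element Set namespace
BASE = "http://example.org/resource/"     # hypothetical base URI (assumption)

# The 15 elements of the Dublin Core Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, id_column="id"):
    """Turn each CSV row into N-Triples lines, one per non-empty DC column.

    Columns whose names are not Dublin Core elements are ignored.
    The id_column name is an assumption made for this sketch.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row[id_column])}>"
        for column, value in row.items():
            if column in DC_ELEMENTS and value:
                lines.append(
                    f'{subject} <{DC}{column}> "{escape_literal(value)}" .'
                )
    return lines


# Hypothetical sample row, for illustration only.
sample = "id,title,date,rights\n001,Sample record,2017-03-08,CC BY 4.0\n"
triples = csv_to_ntriples(sample)
for triple in triples:
    print(triple)
```

Keeping the intermediate data as CSV means a person can correct a cell in a spreadsheet and rerun the conversion, which is the kind of human-editable step the description refers to. A full pipeline would more likely use an RDF library and typed or language-tagged literals.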


Field Value
rdf:type prov:Activity, r4r:Provenance
prov:wasAssociatedWith agent:q20872470
prov:startedAtTime 2017-03-09T16:19:31.925827+08:00
prov:endedAtTime 2017-03-09T16:19:31.928124+08:00
prov:atLocation gns:6728700
r4r:hasLicense CC4.0:BY