Metadata as Linked Data for Research Data Repositories

METADATA

Field	Value
rdf:type	r4r:Article
r4r:locateAt	http://data.odw.tw/article/20170308
dc:coverage	台北, 台灣 Taipei, Taiwan
dc:creator	李承錱, 黃韋菁, 莊庭瑞 Cheng-Jen Lee, Andrea Wei-Ching Huang and Tyng-Ruey Chuang
dc:date	2017-03-08
dc:description	Data repositories have long been an effective mechanism for integrating and managing research resources, and improving the reuse value of those resources is one of the key issues they face. Recently, publishing such resources on the World Wide Web as linked data, with referenced interlinks and rich semantics, has drawn much attention from data repository owners. In this study, we first converted 840,000 XML-formatted, CC-licensed digital resources from the Union Catalog of Digital Archives Taiwan (http://catalog.digitalarchives.tw/) into human-editable CSV tabular data. The tabular data are then transformed into linked data in RDF (Resource Description Framework). This XML-CSV-RDF process keeps the resources accessible to both humans and machines. An ontology (voc4odw) is also designed to describe each resource and its provenance. Two types of linked data resources are generated: one type is described simply by the 15 Dublin Core elements, and the other is enriched with domain vocabularies from external datasets including Wikidata, GeoNames, and Encyclopedia of Life. Furthermore, for the second type, several “versions” of a resource may exist, each using different domain vocabularies, to provide more insight into the data. To demonstrate the generated linked data resources, a linked open data (LOD) repository, http://data.odw.tw, was built using CKAN (Comprehensive Knowledge Archive Network). CKAN is an open source data management system with comprehensive functions for publishing, storing, managing, showing, and using data, including both raw data and metadata. Resources in RDF format were loaded into our CKAN instance through a custom harvesting method, which also makes it possible to export each resource in various linked data formats such as RDF/XML, Turtle, and JSON-LD.
Thanks to CKAN’s flexibility in metadata customization and data preview methods, we extended CKAN so that people can explore all linked data resources in well-formed table views and find resources by keyword, facet, time range, or spatial extent. In addition, a SPARQL endpoint provided by OpenLink Virtuoso is integrated into CKAN’s interface for advanced usage. Ongoing work includes linking to more external datasets, improving the import process, and aggregating existing resources to infer and obtain “new knowledge” with CKAN’s data representation capabilities.
dc:format	PDF
dc:identifier	DOI: 10.13140/RG.2.2.30887.34724
dc:language	English
dc:publisher	International Symposium on Grids and Clouds (ISGC) 2017
dc:relation	http://indico4.twgrid.org/indico/event/2/
dc:rights	CC BY 4.0
dc:source	http://m.odw.tw/u/odw/m/metadata-as-linked-data-for-research-data-repositories/ http://event.twgrid.org/isgc2017/ http://m.odw.tw/u/odw/m/metadata-as-linked-data-for-research-data-repositories-sls/
dc:subject	CKAN, LOD, semantics, archive, catalog, metadata, research repository
dc:title	Metadata as Linked Data for Research Data Repositories
dc:type	Conference Poster
r4r:hasProvenance	http://data.odw.tw/article/p20170309-20170308
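
The description above mentions transforming CSV tabular data into RDF described by Dublin Core elements. The record does not say which tooling was used for that step, so the following is only a minimal sketch of the idea, using the Python standard library and N-Triples output. The CSV column layout, the sample row, the example.org URI, and the function names are all assumptions made for illustration; they are not taken from the project.

```python
import csv
import io

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical CSV layout (an assumption): one row per resource, with an
# "identifier" column holding the resource URI and one column per DC element.
SAMPLE_CSV = """identifier,title,creator,date
http://example.org/resource/1,Sample record,Example Author,2017-03-08
"""


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(text):
    """Turn each non-empty CSV cell into one Dublin Core triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row["identifier"]
        for column, value in row.items():
            # The identifier column is the subject, not a property.
            if column == "identifier" or not value:
                continue
            triples.append(
                f'<{subject}> <{DC}{column}> "{escape_literal(value)}" .'
            )
    return triples


TRIPLES = csv_to_ntriples(SAMPLE_CSV)

if __name__ == "__main__":
    for line in TRIPLES:
        print(line)
```

Each line of output is a single subject-predicate-object statement, so the result can be loaded into a triple store or converted to other serializations with a general-purpose RDF library. Enrichment with domain vocabularies, as described for the second type of resource, would add further triples and is not covered by this sketch.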