Convert huge linked data dumps (RDF/XML, JSON-LD, TTL) to TSV/CSV

Linked data collections are usually published in RDF/XML, JSON-LD, or TTL format, and large dumps are difficult to process with standard tools. What is a good way to convert an RDF/XML file to a TSV of linked-data triples?

I've tried OpenRefine, which should handle this, but a 10 GB file (e.g. the person authority data from the German National Library) is too much for a laptop with decent processing power.

I'm looking for software recommendations or some example Python/R code to do the conversion. Thanks!
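
For reference, this is roughly what I have in mind (a minimal rdflib sketch; the file name is made up, and it parses the whole graph into memory, which is exactly what breaks down at 10 GB):

    # Minimal sketch using rdflib: fine for small files, but it loads
    # the whole graph into memory, so it won't work for a 10 GB dump.
    import csv
    from rdflib import Graph

    g = Graph()
    g.parse("gnd_sample.rdf", format="xml")  # hypothetical file name

    with open("triples.tsv", "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for s, p, o in g:
            # Note: literals containing tabs/newlines would need escaping.
            writer.writerow([s, p, o])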

asked Oct 21 '25 by puslet88

2 Answers

Try these:

Lobid GND API

http://lobid.org/gnd/api

It supports OpenRefine (see the blog post) and a variety of other queries. The data is hosted as JSON-LD (see the context) in an Elasticsearch cluster, and the service offers a rich HTTP API.
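
As a minimal sketch of using the API from Python (the q/format/size parameters and the member, preferredName, and gndIdentifier fields follow the lobid-gnd documentation; adjust them to the fields you actually need):

    # Query the lobid-gnd search API and flatten a few fields to TSV.
    # Field names are taken from the lobid-gnd JSON-LD responses.
    import csv
    import requests

    resp = requests.get(
        "https://lobid.org/gnd/search",
        params={"q": "Goethe", "format": "json", "size": 100},
    )
    resp.raise_for_status()

    with open("gnd_hits.tsv", "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["gndIdentifier", "preferredName"])
        for hit in resp.json().get("member", []):
            writer.writerow([hit.get("gndIdentifier", ""), hit.get("preferredName", "")])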

Use a Triple Store

Load the data into a triple store of your choice, e.g. rdf4j. Many triple stores provide some sort of CSV serialization; together with SPARQL, this could be worth a try.
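
For example, assuming a local rdf4j server with the dump loaded into a repository called "gnd" (endpoint URL and repository name are assumptions for the sketch), SPARQLWrapper can fetch SELECT results directly as CSV:

    # Run a SPARQL SELECT against a local rdf4j repository and save the
    # result as CSV. Endpoint URL and repository name ("gnd") are
    # assumptions; adjust to your setup. For very large dumps, page
    # through the data with a LIMIT/OFFSET loop instead of one query.
    from SPARQLWrapper import SPARQLWrapper, CSV

    sparql = SPARQLWrapper("http://localhost:8080/rdf4j-server/repositories/gnd")
    sparql.setQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o }")
    sparql.setReturnFormat(CSV)

    with open("triples.csv", "wb") as f:
        f.write(sparql.query().convert())  # CSV results come back as bytes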

Catmandu

http://librecat.org/Catmandu/

A strong Perl-based data toolkit that comes with a useful collection of ready-to-use transformation pipelines.

Metafacture

https://github.com/metafacture/metafacture-core/wiki

A Java toolkit for designing transformation pipelines.

answered Oct 24 '25 by jschnasse

You could use the ontology editor Protege: there, you can query the data with SPARQL according to your needs and save the results as a TSV file. It might be important, however, to configure the software beforehand so that these amounts of data remain manageable.

answered Oct 24 '25 by Yahalnaut