How can I convert a simple tab-delimited txt file (with the headers subject, predicate, object) into RDF N-Triples format using the Python module RDFLib?
It's not very complicated. First, some necessary imports:
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
from rdflib import Graph, URIRef
I'm using StringIO here to avoid creating a file. Instead, I'll just define some contents and wrap them in a file-like object:
contents = '''\
subject1\tpredicate1\tobject1
subject2\tpredicate2\tobject2'''
tabfile = StringIO(contents)
Then create a graph and load all the triples into it:
graph = Graph()
for line in tabfile:
    triple = line.strip().split('\t')  # triple is now a list of 3 strings
    triple = tuple(URIRef(t) for t in triple)  # wrap each term in URIRef
    graph.add(triple)  # and add it to the graph
Now you have the whole graph in memory (assuming you have enough memory, of course). You can now print it:
print(graph.serialize(format='nt'))
# prints:
# <subject1> <predicate1> <object1> .
# <subject2> <predicate2> <object2> .