I need to perform a query against DBpedia:
SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {
?poi <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .
OPTIONAL {?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .
OPTIONAL {?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .
FILTER ( ?lat > x && ?lat < y &&
?long > z && ?long < ω &&
langMatches( lang(?label), "EN" ))
}
I'm guessing this information is scattered among different dumps (.nt) files and somehow the SPARQL endpoint serves us with a result set. I need to download these different .nt files locally (not all DBpedia), perform only once my query and store the results locally (I don't want to use the SPARQL endpoint).
I m a bit confused reading from this post:
So, you can load the entire DBPedia data into a single TDB location on disk (i.e. a single directory). This way, you can run SPARQL queries over it.
How do I load the DBpedia into a single TDB location, in Jena terms, if we got three .nt DBpedia files? How do we apply the above query on those .nt files? (Any code would help.)
Example, is this wrong?
String tdbDirectory = "C:\\TDB";
String dbdump1 = "C:\\Users\\dump1_en.nt";
String dbdump2 = "C:\\Users\\dump2_en.nt";
String dbdump3 = "C:\\Users\\dump3_en.nt";
Dataset dataset = TDBFactory.createDataset(tdbDirectory);
Model tdb = dataset.getDefaultModel(); //<-- What is the default model?Should I care?
//Model tdb = TDBFactory.createModel(tdbdirectory) ;//<--is this prefered?
FileManager.get().readModel( tdb, dbdump1, "N-TRIPLES" );
FileManager.get().readModel( tdb, dbdump2, "N-TRIPLES" );
FileManager.get().readModel( tdb, dbdump3, "N-TRIPLES" );
String q = "my big fat query";
Query query = QueryFactory.create(q);
QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
ResultSet results = qexec.execSelect();
while (results.hasNext()) {
//do something significant with it
}
qexec.close()
tdb.close() ;
dataset.close();
"dataset.getDefaultModel"
(to get the default graph as a Jena Model
). Is this statement valid? Do we need to create a dataset to perform the query, or should we go with TDBFactory.createModel(tdbdirectory)
?To let Jena index locally :
/** The Constant tdbDirectory. */
public static final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels";
/** The Constant dbdump0. */
public static final String dbdump0 = "C:\\Users\\Public\\Documents\\TDB\\dbpedia_3.8\\dbpedia_3.8.owl";
/** The Constant dbdump1. */
public static final String dbdump1 = "C:\\Users\\Public\\Documents\\TDB\\geo_coordinates_en\\geo_coordinates_en.nt";
...
Model tdbModel = TDBFactory.createModel(tdbDirectory);<\n>
/*Incrementally read data to the Model, once per run , RAM > 6 GB*/
FileManager.get().readModel( tdbModel, dbdump0);
FileManager.get().readModel( tdbModel, dbdump1, "N-TRIPLES");
FileManager.get().readModel( tdbModel, dbdump2, "N-TRIPLES");
FileManager.get().readModel( tdbModel, dbdump3, "N-TRIPLES");
FileManager.get().readModel( tdbModel, dbdump4, "N-TRIPLES");
FileManager.get().readModel( tdbModel, dbdump5, "N-TRIPLES");
FileManager.get().readModel( tdbModel, dbdump6, "N-TRIPLES");
tdbModel.close();
To query Jena:
String queryStr = "dbpedia query ";
Dataset dataset = TDBFactory.createDataset(tdbDirectory);
Model tdb = dataset.getDefaultModel();
Query query = QueryFactory.create(queryStr);
QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
/*Execute the Query*/
ResultSet results = qexec.execSelect();
while (results.hasNext()) {
// Do something important
}
qexec.close();
tdb.close() ;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With