 

Jena: How to infer data / performance issues

I'd like to use Jena's inference capabilities, but I'm running into performance problems when using InfModel.

Here's a simplified overview of my ontology:

Properties:

hasX            (Ranges(intersection): X, inverse properties: isXOf)
|-- hasSpecialX (Ranges(intersection): X, inverse properties: isSpecialXOf)

isXOf           (Domains(intersection): X, inverse properties: hasX)
|--isSpecialXOf (Domains(intersection): X, inverse properties: hasSpecialX)

Furthermore, there's a class 'Object':

Object hasSpecialX some X

Explicitly stored is the following data:

SomeObject a Object 
SomeX a X
SomeObject hasSpecialX SomeX  
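
For reference, this is roughly how the fragment above could be built with Jena's OntModel API. The namespace URI is only a placeholder, and the package names are from Jena 2.x:

import com.hp.hpl.jena.ontology.Individual;
import com.hp.hpl.jena.ontology.ObjectProperty;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.ontology.SomeValuesFromRestriction;
import com.hp.hpl.jena.rdf.model.ModelFactory;

String ns = "http://example.org/onto#";   // placeholder namespace

OntModel ont = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);

OntClass xClass = ont.createClass(ns + "X");
OntClass objectClass = ont.createClass(ns + "Object");

// hasX / isXOf as inverse properties
ObjectProperty hasX = ont.createObjectProperty(ns + "hasX");
ObjectProperty isXOf = ont.createObjectProperty(ns + "isXOf");
hasX.addRange(xClass);
isXOf.addDomain(xClass);
hasX.addInverseOf(isXOf);

// hasSpecialX / isSpecialXOf as sub-properties and inverses
ObjectProperty hasSpecialX = ont.createObjectProperty(ns + "hasSpecialX");
ObjectProperty isSpecialXOf = ont.createObjectProperty(ns + "isSpecialXOf");
hasSpecialX.addSuperProperty(hasX);
isSpecialXOf.addSuperProperty(isXOf);
hasSpecialX.addRange(xClass);
isSpecialXOf.addDomain(xClass);
hasSpecialX.addInverseOf(isSpecialXOf);

// Object subClassOf (hasSpecialX some X)
SomeValuesFromRestriction restriction = ont.createSomeValuesFromRestriction(null, hasSpecialX, xClass);
objectClass.addSuperClass(restriction);

// explicitly asserted instance data
Individual x = ont.createIndividual(ns + "SomeX", xClass);
Individual obj = ont.createIndividual(ns + "SomeObject", objectClass);
obj.addProperty(hasSpecialX, x);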

Using the following query, I'd like to determine to which class an instance belongs. Since hasSpecialX is a subproperty of hasX, the reasoner should infer 'SomeObject hasX SomeX', so only 'SomeObject' should be returned (the prefix declaration is omitted here for brevity).

SELECT ?x WHERE { ?x :hasX :SomeX . } 

However, querying against dataset.getDefaultModel() doesn't work, because the inferred triple isn't stored explicitly. When I query the InfModel instead, the query never finishes; the longest I've waited was 25 minutes before aborting. (The triple store is about 180 MB in size.)

This is my code:

OntModel ont = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF, null); 
ont.read("file:..." , "RDF/XML"); 

Reasoner reasoner = ReasonerRegistry.getOWLMicroReasoner(); 
reasoner = reasoner.bindSchema(ont); 

Dataset dataset = TDBFactory.createDataset(...); 
Model model = dataset.getDefaultModel(); 

InfModel infModel = ModelFactory.createInfModel(reasoner, model);

QueryExecution qe = null;
ResultSet rs;

try {
    String qry = "SELECT ?x WHERE { ?x :hasX :SomeX . }"; 
    qe = QueryExecutionFactory.create(qry, infModel); 
    rs = qe.execSelect(); 

    while(rs.hasNext()) {
        QuerySolution sol = rs.nextSolution(); 
        System.out.println(sol.get("x"));
    }
} finally {
    if (qe != null) {
        qe.close();
    }
    infModel.close();
    model.close();
    dataset.close();
}

Is there anything wrong with the code above, or what else could be the reason it doesn't work?

Besides that, I'd like to know whether I can improve performance by using 'Export inferred axioms as ontology' (as provided by Protégé)?

EDIT: In the meantime I've tried to use Pellet, but I still can't get an inferred model, as described in my other question: OutOfMemoryError using Pellet as Reasoner. So what else can I do?

asked Apr 16 '12 by Pedro

1 Answer

Regarding performance, it is better to do the inference before asserting the data, not at query time, and to run the SPARQL queries with the Jena inference mechanism off. You are already using TDB, which is the right Jena component for big datasets.
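
A rough sketch of that approach, assuming Jena 2.x package names and placeholder file paths and namespace: compute the closure once in memory, add the resulting statements to TDB, and then run the SPARQL queries against the plain TDB model with no reasoner attached.

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;
import com.hp.hpl.jena.tdb.TDBFactory;

// Load schema and data into memory and compute the inferences once.
OntModel schema = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
schema.read("file:ontology.rdf", "RDF/XML");        // placeholder path

Model data = ModelFactory.createDefaultModel();
data.read("file:data.rdf", "RDF/XML");              // placeholder path

Reasoner reasoner = ReasonerRegistry.getOWLMicroReasoner().bindSchema(schema);
InfModel inf = ModelFactory.createInfModel(reasoner, data);

// Materialize the inferred closure into TDB (asserted + inferred triples).
Dataset dataset = TDBFactory.createDataset("/path/to/tdb");   // placeholder path
Model tdbModel = dataset.getDefaultModel();
tdbModel.add(inf);

// Query the plain TDB model -- no reasoner attached at query time.
String qry = "PREFIX : <http://example.org/onto#> "           // placeholder prefix
           + "SELECT ?x WHERE { ?x :hasX :SomeX . }";
QueryExecution qe = QueryExecutionFactory.create(qry, tdbModel);
try {
    ResultSet rs = qe.execSelect();
    while (rs.hasNext()) {
        QuerySolution sol = rs.nextSolution();
        System.out.println(sol.get("x"));
    }
} finally {
    qe.close();
    dataset.close();
}

This way the reasoner runs only once at load time, and every subsequent query touches only explicitly stored triples.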

If by using the inferred data directly you do not get the expected performance, then I recommend moving to a more scalable triple store (4store or Virtuoso).

answered by Manuel Salvadores