Sparql query with Blank node can be complex

I read this blog article, Problems of the RDF model: Blank Nodes, which mentions that using blank nodes can complicate the handling of data.

Can you give me an example of why blank nodes make a SPARQL query difficult to write? I do not understand the complexity of blank nodes. Can you explain to me the meaning and semantics of an existential variable? I do not clearly understand the explanation given in the RDF Semantics Recommendation, 1.5. Blank Nodes as Existential Variables.

asked Dec 17 '13 08:12

Competo

1 Answer

Existential Variables

In the (first-order) predicate calculus, there is existential quantification which lets us make assertions about things that exist, without saying (or, possibly, knowing) which specific individuals in the domain we're actually talking about. For instance, a sentence like

hasUserId(JoshuaTaylor,1281433)

entails the sentence

∃x.hasUserId(x,1281433)

Of course, there are lots of scenarios in which the second sentence could be true without the first one being true. In that sense, the second sentence gives us less information than the first. It's also important to note that the variable x in the second sentence doesn't provide any way to find out which element in the domain of discourse actually has the given user id. It also doesn't claim that only one such thing has the given user id. To make that clearer, we might use an example:

∃y.hasAge(y,29)

This is presumably true, since someone or something out there is age 29. Note that we can't talk about y as the individual that is age 29, though, because there could be lots of them. All this sentence tells us is that there is at least one.

Even though we used different variables in the two sentences, nothing says that the individuals with the specified properties are distinct. This is particularly important in nested quantification, e.g.,

∃x.∃y.likes(x, y)

This sentence could be true because there is one individual in the domain that likes itself. Just because x and y have different names in the sentence doesn't mean that they can't refer to the same individual.
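To make the nested case concrete, here is a minimal sketch that evaluates ∃x.∃y.likes(x, y) by brute force over a small finite domain. The domain, the `likes` relation, and the function names are made up for illustration; they are not part of any RDF or logic library.

```python
from itertools import product

# Hypothetical finite domain and relation, chosen only for illustration.
domain = {"alice", "bob"}
likes = {("alice", "alice")}  # alice likes herself; nothing else is asserted


def exists_xy_likes(domain, likes):
    """Evaluate ∃x.∃y.likes(x, y) by trying every pair, including x == y."""
    return any((x, y) in likes for x, y in product(domain, repeat=2))


def witnesses(domain, likes):
    """List the pairs that make the formula true; the formula names none of them."""
    return sorted((x, y) for x, y in product(domain, repeat=2) if (x, y) in likes)


print(exists_xy_likes(domain, likes))  # True
print(witnesses(domain, likes))        # [('alice', 'alice')]
```

The only witness has x and y bound to the same individual, yet the formula is still true. The formula itself only reports that some witness exists, not which one.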

Blank Nodes as Existential Variables

RDF Semantics defines an entailment model for RDF. This is described in more detail in another Stack Overflow question, RDF Graph Entailment. The idea is that an RDF graph is treated as one big existential quantification over the blank nodes mentioned in the graph. E.g., if the triples in the graph are t₁, …, t_n, and the blank nodes that appear in those triples are b₁, …, b_m, then the graph is a formula:

∃b₁, …, b_m.(t₁ ∧ … ∧ t_n)

Based on the discussion of the existential variables above, note that this means that blank nodes in the data might refer to the same element of the domain, or to different elements, and that nothing requires exactly one element to be able to take the place of a blank node. This means that a graph with blank nodes, when interpreted in this manner, provides much less information than you might expect.
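As a rough illustration of this reading, the sketch below checks whether a graph with blank nodes follows from another graph by searching for an assignment of its blank nodes to terms. It is a brute-force toy, not the algorithm used by any real RDF library. It assumes triples are plain tuples of strings and that blank nodes are the strings starting with `_:`; the `:likes` data is hypothetical.

```python
from itertools import product


def is_blank(term):
    # Convention for this sketch only: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def simply_entails(graph, pattern):
    """Return True if some assignment of the blank nodes in `pattern` to terms
    of `graph` maps every triple of `pattern` onto a triple of `graph`."""
    blanks = sorted({t for triple in pattern for t in triple if is_blank(t)})
    terms = sorted({t for triple in graph for t in triple})
    for values in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, values))
        mapped = {tuple(mapping.get(t, t) for t in triple) for triple in pattern}
        if mapped <= graph:
            return True
    return False


# Hypothetical data: one individual that likes itself.
ground = {(":Carol", ":likes", ":Carol")}

# Two differently named blank nodes may both stand for :Carol.
print(simply_entails(ground, {("_:a", ":likes", "_:b")}))    # True
# Nothing in the graph likes :Mike, so no assignment works.
print(simply_entails(ground, {("_:a", ":likes", ":Mike")}))  # False
```

The first check succeeds even though `_:a` and `_:b` have different labels, which is exactly the "might refer to the same element" point above.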

Blank Nodes in Real Data

Now, the discussion above is useful if people are using blank nodes as existential variables. In many cases, authors think of them more as anonymous, but definite and distinct objects. E.g., if we casually write

@prefix : <https://stackoverflow.com/q/20629437/1281433/> .

:Carol :hasAddress [ :hasNumber 4222 ;
                     :hasStreet :Clinton_Way ] .

we may well be trying to say that there is a single address out there with the specified properties, but according to the RDF entailment model, that's not what we're doing.

In practice, this isn't so much of a problem, because we're usually not using RDF entailment. What is a problem, though, is that since the scope of blank nodes is local to a graph, we can't run a SPARQL query against an endpoint asking for Carol's address and get back an IRI that we can reuse. If we run a query like this:

prefix : <https://stackoverflow.com/q/20629437/1281433/>

construct {
  :Mike :hasAddress ?address
}
where {
  :Carol :hasAddress ?address
}

then we get back the following (unhelpful) graph as a result:

@prefix :      <https://stackoverflow.com/q/20629437/1281433/> .

:Mike   :hasAddress  []  .

We won't have a way to get more information about the address because all we have now is a blank node. If we had used IRIs, e.g.,

@prefix : <https://stackoverflow.com/q/20629437/1281433/> .

:Carol :hasAddress :address1267389 .
:address1267389 :hasNumber 4222 ;
                :hasStreet :Clinton_Way .

then the query would have produced something more helpful:

@prefix :      <https://stackoverflow.com/q/20629437/1281433/> .

:Mike   :hasAddress  :address1267389 .

Why is this more useful? The first case is like having the data

∃ x.(hasAddress(Carol,x) ∧ hasNumber(x,4222) ∧ hasStreet(x,ClintonWay))

and getting back a result

∃ y.hasAddress(Mike,y)

Sure, it's possible that Mike and Carol have the same address, but from these sentences there's no way to know for sure. It's much more helpful to have data like

hasAddress(Carol,address1267389)
hasNumber(address1267389,4222)
hasStreet(address1267389,ClintonWay)

and get back a result

hasAddress(Mike,address1267389)

From this, you know that they have the same address, and you can ask things about it.
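The difference shows up as soon as you try a follow-up query. The sketch below is a toy in-memory matcher, not a SPARQL engine. It relies on one fact about SPARQL: a blank node label written in a query pattern behaves like a variable, so it cannot pin down a particular node. The `match` helper and the `:Dave` and `:address99` data are assumptions added for illustration.

```python
def is_var(term):
    # In a SPARQL graph pattern a blank node label acts like a variable,
    # so this sketch treats "?x" and "_:x" pattern terms alike.
    return term.startswith("?") or term.startswith("_:")


def match(graph, pattern):
    """Return the triples of `graph` that match one triple pattern, sorted."""
    results = []
    for triple in sorted(graph):
        bindings = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                # A repeated variable must bind to the same term each time.
                if bindings.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            results.append(triple)
    return results


# Addresses as blank nodes (the labels are local to this graph).
blank_graph = {
    (":Carol", ":hasAddress", "_:a1"),
    ("_:a1", ":hasNumber", "4222"),
    (":Dave", ":hasAddress", "_:a2"),   # hypothetical second person
    ("_:a2", ":hasNumber", "17"),
}

# The same data with IRIs for the addresses.
iri_graph = {
    (":Carol", ":hasAddress", ":address1267389"),
    (":address1267389", ":hasNumber", "4222"),
    (":Dave", ":hasAddress", ":address99"),  # hypothetical second person
    (":address99", ":hasNumber", "17"),
}

# Follow-up with a blank node label: it matches every address.
print(len(match(blank_graph, ("_:a1", ":hasNumber", "?n"))))     # 2
# Follow-up with the IRI: it matches exactly Carol's address.
print(match(iri_graph, (":address1267389", ":hasNumber", "?n")))
```

With the IRI, the follow-up query returns only the triple about Carol's address. With the blank node label, the pattern is no more specific than `?x :hasNumber ?n`.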

Conclusion

How much this will affect your data and its consumers depends on what the typical use cases are. For automatically constructed graphs, it may be hard to know in advance just what kind of data you'll need to be able to refer to later, so it's a good idea to generate IRIs for as many of your resources as you can. Since IRIs are free-form, it's usually not too hard to do this. For instance, if you've got some sensible “base” IRI, e.g.,

http://example.org/myData/

then you can easily append suffixes to identify your resources. E.g.,

http://example.org/myData/addresses/addr1
http://example.org/myData/addresses/addr2
http://example.org/myData/addresses/addr3
http://example.org/myData/individuals/ind34
http://example.org/myData/individuals/ind35
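A minimal sketch of such a scheme might look like the following. The `mint_iri` helper and its parameters are hypothetical; the only thing taken from above is the example base IRI and the `addresses/` and `individuals/` path segments.

```python
from urllib.parse import quote

# Example base IRI from the text above; replace it with your own.
BASE = "http://example.org/myData/"


def mint_iri(kind, local_id, base=BASE):
    """Build base + kind + "/" + local_id, percent-encoding both parts so
    that spaces and reserved characters cannot break the IRI."""
    if not base.endswith("/"):
        base += "/"
    return base + quote(kind, safe="") + "/" + quote(str(local_id), safe="")


print(mint_iri("addresses", "addr1"))
# http://example.org/myData/addresses/addr1
print(mint_iri("individuals", "ind34"))
# http://example.org/myData/individuals/ind34
```

If the local id comes from a stable key in your source data (a database primary key, for instance), rerunning the conversion should yield the same IRIs each time, which is what lets consumers refer back to a resource later.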

answered Oct 27 '22 21:10

Joshua Taylor

Tags: rdf, semantic-web, sparql, linked-data, blank-nodes