What is the difference between DISTINCT
and REDUCED
in SPARQL?
SPARQL sees your data as a directed, labeled graph, that is internally expressed as triples consisting of subject, predicate and object. Correspondingly, a SPARQL query consists of a set of triple patterns in which each element (the subject, predicate and object) can be a variable (wildcard).
CAPS: Though SPARQL is case-insensitive, SPARQL keywords in this section are written in uppercase for readability. Italics: Terms in italics are placeholder values that you replace in the query.
offset shifts the cursor to the position you mention.
"PREFIX", however (without the "@"), is the SPARQL instruction for a declaration of a namespace prefix. It allows you to write prefixed names in queries instead of having to use full URIs everywhere. So it's a syntax convenience mechanism for shorter, easier to read (and write) queries.
REDUCED is like a 'best effort' DISTINCT. Whereas DISTINCT guarantees no duplicated results, REDUCED may eliminate some, all, or no duplicates.
What's the point? Well DISTINCT can be expensive; REDUCED can do the straightforward de-duplication work (e.g. remove immediately repeated results) without having to remember every row. In many applications that's good enough.
Having said that I've never used REDUCE, I've never seen anyone use REDUCED, and never seen REDUCED mentioned in a talk or tutorial.
In my mind (and in my own SPARQL implementation) REDUCED is effectively an optional DISTINCT constraint which is only applied if the engine deems it to be necessary i.e. the query engine will decide whether or not to eliminate duplicate results based on the query
In my own implementation I only eliminate duplicates when REDUCED has been used if OFFSET/LIMIT has also been used
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With