How to remove duplicates in a SPARQL query

Tags:

sparql

dbpedia

I wrote this query, which returns a list of couples that satisfy a particular condition (run at http://live.dbpedia.org/sparql):

SELECT DISTINCT ?actor ?person2 ?cnt
WHERE
{
{
    select DISTINCT ?actor ?person2 (count (?film) as ?cnt) 
    where { 
        ?film    dbo:starring ?actor .
        ?actor dbo:spouse ?person2. 
        ?film    dbo:starring ?person2.
    }
    order by ?actor
}
FILTER (?cnt >9)
}

The problem is that some rows are duplicated. For example:

http://dbpedia.org/resource/George_Burns http://dbpedia.org/resource/Gracie_Allen 12

http://dbpedia.org/resource/Gracie_Allen http://dbpedia.org/resource/George_Burns 12

How can I remove these duplicates? I tried adding a gender condition on ?actor, but that damages the current result.

496

asked Apr 01 '16 04:04

NASRIN

2 Answers

Natan Cox's answer shows the typical way to exclude this kind of pseudo-duplicate. The results aren't actually duplicates, because in one row, e.g., George Burns is the ?actor, and in the other he is the ?person2. In many cases, you can add a filter that requires the two values to be ordered, and that removes the duplicate cases. E.g., when you have data like:

:a :likes :b .
:a :likes :c .

and you search for

select ?x ?y where { 
  :a :likes ?x, ?y .
}

you can add filter(?x < ?y) to enforce an ordering between ?x and ?y, which removes these pseudo-duplicates. However, in this case, it's a bit trickier, since ?actor and ?person2 aren't found using the same criteria. If DBpedia contains

:PersonB dbo:spouse :PersonA

but not

:PersonA dbo:spouse :PersonB

then, assuming PersonA sorts before PersonB, the simple filter won't work: the only match binds ?actor to PersonB and ?person2 to PersonA, the filter rejects it, and the couple drops out of the results entirely. So in this case, you also need to modify your query a bit to make the criteria symmetric:

select distinct ?actor ?spouse (count(?film) as ?count) {
  ?film dbo:starring ?actor, ?spouse .
  ?actor dbo:spouse|^dbo:spouse ?spouse .
  filter(?actor < ?spouse)
}
group by ?actor ?spouse
having (count(?film) > 9)
order by ?actor

(This query also shows that you don't need a subquery here; you can use having to "filter" on aggregate values.) But the important part is using the property path dbo:spouse|^dbo:spouse to find a value for ?spouse such that either ?actor dbo:spouse ?spouse or ?spouse dbo:spouse ?actor. This makes the relationship symmetric, so that you're guaranteed to get all the pairs, even if the relationship is only declared in one direction.
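To make the effect of the symmetric path concrete, here is a small plain-Python sketch that mimics the join on a hand-made set of triples. Everything in it is an illustrative assumption rather than DBpedia content: the toy triples, the names Film1 and Film2, and the helper `pairs`. Python string comparison only stands in for the `<` comparison the endpoint applies to the two resources.

```python
# Toy triples (subject, predicate, object); hypothetical data, not from DBpedia.
# The spouse relation is declared in one direction only: PersonB -> PersonA.
triples = {
    ("Film1", "starring", "PersonA"),
    ("Film1", "starring", "PersonB"),
    ("Film2", "starring", "PersonA"),
    ("Film2", "starring", "PersonB"),
    ("PersonB", "spouse", "PersonA"),
}


def pairs(symmetric):
    """Count shared films per (actor, spouse) pair, keeping only actor < spouse."""
    spouse = {(s, o) for s, p, o in triples if p == "spouse"}
    if symmetric:
        # Same idea as dbo:spouse|^dbo:spouse: also follow the inverse direction.
        spouse |= {(o, s) for s, o in spouse}
    starring = {(s, o) for s, p, o in triples if p == "starring"}
    counts = {}
    for actor, partner in spouse:
        if not actor < partner:  # the ordering filter
            continue
        films_actor = {f for f, a in starring if a == actor}
        films_partner = {f for f, a in starring if a == partner}
        shared = films_actor & films_partner
        if shared:
            counts[(actor, partner)] = len(shared)
    return counts


one_way = pairs(symmetric=False)   # ordering filter alone
both_ways = pairs(symmetric=True)  # ordering filter plus symmetric spouse match
print(one_way)
print(both_ways)
```

With the one-directional data, the ordering filter alone loses the couple, while the symmetric variant reports it exactly once.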

199

answered Oct 27 '22 18:10

Joshua Taylor

These aren't actual duplicates, of course, since you can look at each couple from both directions. If you want to fix it, add a filter. It is a bit of a dirty hack, but it keeps only one of the 2 rows that are the "same".

SELECT DISTINCT ?actor ?person2 ?cnt
WHERE
{
{
    select DISTINCT ?actor ?person2 (count (?film) as ?cnt) 
    where { 
        ?film    dbo:starring ?actor .
        ?actor dbo:spouse ?person2. 
        ?film    dbo:starring ?person2.
        FILTER (?actor < ?person2)
    }
    order by ?actor
}
FILTER (?cnt >9)
}
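As a rough illustration of why the filter keeps exactly one row per couple, the following plain-Python sketch applies the same ordering test to the two rows from the question. The list `rows` and the function name `keep_ordered` are made up for the example, and using Python string comparison in place of the endpoint's comparison of the two resources is an assumption about how the engine orders them.

```python
# The two mirrored rows reported in the question: (actor, person2, cnt).
rows = [
    ("http://dbpedia.org/resource/George_Burns",
     "http://dbpedia.org/resource/Gracie_Allen", 12),
    ("http://dbpedia.org/resource/Gracie_Allen",
     "http://dbpedia.org/resource/George_Burns", 12),
]


def keep_ordered(rows):
    """Keep only rows whose first resource sorts before the second."""
    return [row for row in rows if row[0] < row[1]]


kept = keep_ordered(rows)
for row in kept:
    print(row)
```

Each couple appears once with its members in either order, and only one of those orders can satisfy the comparison, so a single row survives.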

answered Oct 27 '22 19:10

Natan Cox
