How to handle case-insensitive SPARQL data in MarkLogic

I'm trying to understand how best to handle literals in MarkLogic SPARQL data that may be in any case. I'd like to do a case-insensitive search, but I believe that isn't possible with semantic queries. As a simplistic example, I want:

SELECT *
WHERE { ?s ?p "Red"}

and

SELECT *
WHERE { ?s ?p "red"}

to return all values whether the object is "Red", "RED", "red" or "rED".

My data comes from another source with variable capitalisation rules. At the moment the only thing I can think of is to add an extra triple that always contains the text in lower case, so I can always search on that value. Alternatively, would it make sense to create a new range index in MarkLogic with a case-insensitive collation and query against that (if that's possible on triple data)?

Millstone1998 asked Dec 02 '14 17:12


1 Answer

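You could use a FILTER that lowercases each object before comparing it, so the match ignores case. Note that the literal you compare against must itself be lower case.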

select * where {
  ?s ?p ?o
  FILTER (lcase(str(?o)) = "red")
}

Based on the answer to another question.
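
The same idea can also be written with a case-insensitive regular expression instead of lcase. This is only a sketch, assuming your engine supports the standard SPARQL regex function with the "i" flag; I have not checked how it performs on MarkLogic, and like the lcase version it still has to examine every candidate triple.

```sparql
# Match any literal object equal to "red" regardless of case.
# The ^ and $ anchors make this a whole-value match rather than a substring match.
SELECT *
WHERE {
  ?s ?p ?o
  FILTER (isLiteral(?o) && regex(str(?o), "^red$", "i"))
}
```

The isLiteral check is there only to skip IRIs and blank nodes; drop it if you also want to match against IRI strings.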

Edit: I asked Steve Buxton, MarkLogic's PM for semantics features, and he suggested this:

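(: Restrict the store with a case-insensitive cts query on sem:object :)
let $store := sem:store(
  (),
  cts:element-value-query(xs:QName("sem:object"), "red", "case-insensitive")
)
return
  (: The FILTER then only runs over the triples in the restricted store :)
  sem:sparql('
    SELECT ?o
    WHERE {
      ?s ?p ?o
      FILTER (lcase(str(?o)) = "red")
    }', (), (), $store
  )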

sem:store is a function introduced in MarkLogic 8 (now available through Early Access) that selects a group of triples. The SPARQL query then runs against that reduced set, which limits the number of triples the FILTER has to examine.

Dave Cassel answered Nov 20 '22 18:11