 

How to extract a substring using a regular expression?

Tags: regex, sparql

I encounter strings like this in a SPARQL query:

o = 'some interesting {foo123:bar_675:get_me.xyz} string'

and I want to extract the part after the last colon from the part inside the curly brackets, so in this case get_me.xyz.

I know that the regex \{.*:(.*)\} would work (tested in Python):

import re

o = 'some interesting {foo123:bar_675:get_me.xyz} string'
re.findall(r'\{.*:(.*)\}', o)

will return

['get_me.xyz']

How can I use this regex in a SPARQL query?

I tried

SELECT (regex(?o, "\{.*:(.*)\}") as ?substring) ?o  
WHERE { 
  ?s ?p ?o .   
}

But that always throws an error:

Parse error on line 6:
...SELECT (regex(?o, "\{.*:(.*)\}") as ?
---------------------^
Expecting 'IRIREF', 'PNAME_NS', 'VAR', '(', 'INTEGER', '!', '-', 'FUNC_ARITY0', 'FUNC_ARITY1', 'FUNC_ARITY2', 'IF', 'BOUND', 'BNODE', 'EXISTS', 'COUNT', 'FUNC_AGGREGATE', 'GROUP_CONCAT', 'DECIMAL', 'DOUBLE', 'true', 'false', 'STRING_LITERAL1', 'STRING_LITERAL2', 'STRING_LITERAL_LONG1', 'STRING_LITERAL_LONG2', 'INTEGER_POSITIVE', 'DECIMAL_POSITIVE', 'DOUBLE_POSITIVE', 'INTEGER_NEGATIVE', 'DECIMAL_NEGATIVE', 'DOUBLE_NEGATIVE', 'PNAME_LN', '+', 'NOT', 'CONCAT', 'COALESCE', 'SUBSTR', 'REGEX', 'REPLACE', got 'INVALID'
Cleb asked Oct 17 '25 10:10


1 Answer

REGEX is a filter test: it only returns true or false. REPLACE is the extraction operation.

SELECT *
WHERE { 
  ?s ?p ?o .   
  FILTER REGEX(?o, "\\{.*:(.*)\\}")
}

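This tests ?o against the pattern but does not extract the part captured by ().

Note the double backslash (\\). In a SPARQL string literal, \{ is not a valid escape sequence, so the backslash itself has to be escaped; that is why the original query fails to parse.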

To extract, use BIND with REPLACE: match the whole string and replace it with the captured group $1.

SELECT * {
  ?s ?p ?o .   
  BIND(REPLACE(?o, "^.*\\{.*:(.*)\\}.*$", "$1") AS ?substring)
}

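In the general case, you may need str(?o) instead of ?o in these functions, because ?o can also be bound to an IRI or a non-string literal, and REGEX and REPLACE expect string arguments.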
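
To see what the REPLACE pattern does without a SPARQL endpoint, here is a rough Python sketch of the same idea, using the sample string from the question. The function name extract_after_last_colon is made up for illustration. Python's re.sub is only a stand-in here: it writes the group reference as \1 where SPARQL uses $1, and a raw string needs a single backslash where the SPARQL literal needs two.

```python
import re

# Same pattern as in the SPARQL REPLACE call. In a Python raw string a single
# backslash is enough; the SPARQL string literal needs "\\" instead.
PATTERN = r'^.*\{.*:(.*)\}.*$'


def extract_after_last_colon(text):
    """Hypothetical helper: replace the whole string with capture group 1."""
    # Python spells the group reference \1; SPARQL's REPLACE spells it $1.
    return re.sub(PATTERN, r'\1', text)


o = 'some interesting {foo123:bar_675:get_me.xyz} string'
print(extract_after_last_colon(o))                  # get_me.xyz
print(extract_after_last_colon('no braces here'))   # no braces here
```

The second call shows a caveat: when the pattern does not match, the input comes back unchanged. As far as I know SPARQL's REPLACE behaves the same way, so combine the BIND with the FILTER REGEX test above if you only want rows that actually contain the {...:...} part.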

AndyS answered Oct 19 '25 00:10


