Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Looking for a scalar function to find the last occurrence of a character in a string

Tags:

sql

db2

Table FOO has a column FILEPATH of type VARCHAR(512). Its entries are absolute paths:

FILEPATH
------------------------------------------------------------
file://very/long/file/path/with/many/slashes/in/it/foo.xml
file://even/longer/file/path/with/more/slashes/in/it/baz.xml
file://something/completely/different/foo.xml
file://short/path/foobar.xml

There's ~50k records in this table and I want to know all distinct filenames, not the file paths:

foo.xml
baz.xml
foobar.xml

This looks easy, but I couldn't find a DB2 scalar function that allows me to search for the last occurrence of a character in a string. Am I overseeing something?

I could do this with a recursive query, but this appears to be overkill for such a simple task and (oh wonder) is extremely slow:

WITH PATHFRAGMENTS (POS, PATHFRAGMENT) AS (
  SELECT
    1,
    FILEPATH
  FROM FOO

UNION ALL

  SELECT
    POSITION('/', PATHFRAGMENT, OCTETS) AS POS, 
    SUBSTR(PATHFRAGMENT, POSITION('/', PATHFRAGMENT, OCTETS)+1) AS PATHFRAGMENT
  FROM PATHFRAGMENTS
)
SELECT DISTINCT PATHFRAGMENT FROM PATHFRAGMENTS WHERE POS = 0
like image 921
Alexander Tobias Bockstaller Avatar asked Apr 11 '14 07:04

Alexander Tobias Bockstaller


2 Answers

I think what you're looking for is the LOCATE_IN_STRING() scalar function. This is what Info Center has to say if you use a negative start value:

If the value of the integer is less than zero, the search begins at LENGTH(source-string) + start + 1 and continues for each position to the beginning of the string.

Combine that with the LENGTH() and RIGHT() scalar functions, and you can get what you want:

SELECT
    RIGHT(
         FILEPATH
        ,LENGTH(FILEPATH) - LOCATE_IN_STRING(FILEPATH,'/',-1)
    )
FROM FOO
like image 171
bhamby Avatar answered Nov 14 '22 22:11

bhamby


One way to do this is by taking advantage of the power of DB2s XQuery engine. The following worked for me (and fast):

SELECT DISTINCT XMLCAST(
     XMLQuery('tokenize($P, ''/'')[last()]' PASSING FILEPATH AS "P")
   AS VARCHAR(512) )
FROM FOO

Here I use tokenize to split the file path into a sequence of tokens and then select the last of these tokens. The rest is only conversion from SQL to XML types and back again.

like image 42
Alexander Tobias Bockstaller Avatar answered Nov 14 '22 23:11

Alexander Tobias Bockstaller