Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split string at specific character SQL-Standard

In my SQL statement I have to extract a substring from a string at the character '_'. Strings can be for example 'A_XXX' 'AB_XXX' 'ABC_XXXX', so the extracted substrings should be like 'A' 'AB' 'ABC'.

In Oracle this is easy with the substr() and instr() functions:

select substr('AB_XXX', 1, instr('AB_XXX', '_')-1) as substring
from dual;

The result would be:

SUBSTRING
------------------------
AB

I need this query to check if a specific substring is in an array of strings.

The whole query would look like:

select 'AB_XXX' from dual
where (instr('ABC_AB_A', substr('AB_XXX', 1, instr('AB_XXX', '_')-1))>0);

Is there a way to write it in SQL-Standard?

Thanks in advance for your help.

Edit:

If PostgreSQL provides an alternative function, it also helps. The rest could be solved with e.g. IN. The really important part is to get the substring.

like image 305
user3906778 Avatar asked Apr 27 '15 12:04

user3906778


People also ask

How do I split a string in SQL?

The STRING_SPLIT(string, separator) function in SQL Server splits the string in the first argument by the separator in the second argument. To split a sentence into words, specify the sentence as the first argument of the STRING_SPLIT() function and ' ' as the second argument. FROM STRING_SPLIT( 'An example sentence.

How do I get the first character of a string in PostgreSQL?

The PostgreSQL LEFT() function returns the first n characters in the string.


3 Answers

Your second example is a bit confusing because you are mixing 'ABC_AB_A' and 'AB_XXX' not sure if that is typo.

But if you just want all characters before the first _ then the following works in Postgres:

left(col, strpos(col, '_') - 1)

or using a regular expression:

substring(col from '([A-Z]+)(_{1})')

You can use a regular expression in Oracle as well:

regexp_substr(col, '([A-Z]+)(_{1})', 1, 1, 'i', 1)

Postgres' substring function always returns the first capturing group of the regex whereas in Oracle you can specify the group you want: that is the last parameter to the regexp_substr() function.

SQLFiddle for Oracle: http://sqlfiddle.com/#!4/b138c/1
SQLFiddle for Postgres: http://sqlfiddle.com/#!15/4b2bb/1

like image 151
a_horse_with_no_name Avatar answered Oct 12 '22 11:10

a_horse_with_no_name


tl;dr

Use split_part which was purposely built for this:

split_part(string, '_', 1)

Explanation

Quoting this API docs:

SPLIT_PART() function splits a string on a specified delimiter and returns the nth substring.

The 3 parameters are the string to be split, the delimiter, and the part/substring number (starting from 1) to be returned.

So if you have a field named string that contains stuff like AB_XXX and you would like to get everything before _, then you split by that and get the first part/substring: split_part(string, '_', 1).

like image 31
totymedli Avatar answered Oct 12 '22 11:10

totymedli


The standard SQL string functions are described at: SQL String Functions and Operators.

There's a substring function that can extract contents directly, without having to nest function calls. It's detailed in Pattern matching as:

The substring function with three parameters, substring(string from pattern for escape-character), provides extraction of a substring that matches an SQL regular expression pattern. As with SIMILAR TO, the specified pattern must match the entire data string, or else the function fails and returns null. To indicate the part of the pattern that should be returned on success, the pattern must contain two occurrences of the escape character followed by a double quote ("). The text matching the portion of the pattern between these markers is returned.

In your case:

select substring('AB_XX' from '#"%#"#_%' for '#');

Result:

 substring 
-----------
 AB
(1 row)

The syntax is a bit weird, especially since _ is a wildcard for a single character so it has to be quoted, but it's the SQL standard.

For the syntax that more people use, consider regexp_replace() or a similar function working with POSIX regular expressions.

like image 4
Daniel Vérité Avatar answered Oct 12 '22 13:10

Daniel Vérité