In my SQL statement I have to extract a substring from a string at the character '_'. Strings can be for example 'A_XXX' 'AB_XXX' 'ABC_XXXX', so the extracted substrings should be like 'A' 'AB' 'ABC'.
In Oracle this is easy with the substr() and instr() functions:
select substr('AB_XXX', 1, instr('AB_XXX', '_')-1) as substring
from dual;
The result would be:
SUBSTRING
------------------------
AB
I need this query to check if a specific substring is in an array of strings.
The whole query would look like:
select 'AB_XXX' from dual
where (instr('ABC_AB_A', substr('AB_XXX', 1, instr('AB_XXX', '_')-1))>0);
Is there a way to write it in SQL-Standard?
Thanks in advance for your help.
Edit:
If PostgreSQL provides an alternative function, it also helps. The rest could be solved with e.g. IN. The really important part is to get the substring.
The STRING_SPLIT(string, separator) function in SQL Server splits the string in the first argument by the separator in the second argument. To split a sentence into words, specify the sentence as the first argument of the STRING_SPLIT() function and ' ' as the second argument. FROM STRING_SPLIT( 'An example sentence.
The PostgreSQL LEFT() function returns the first n characters in the string.
Your second example is a bit confusing because you are mixing 'ABC_AB_A'
and 'AB_XXX'
not sure if that is typo.
But if you just want all characters before the first _
then the following works in Postgres:
left(col, strpos(col, '_') - 1)
or using a regular expression:
substring(col from '([A-Z]+)(_{1})')
You can use a regular expression in Oracle as well:
regexp_substr(col, '([A-Z]+)(_{1})', 1, 1, 'i', 1)
Postgres' substring
function always returns the first capturing group of the regex whereas in Oracle you can specify the group you want: that is the last parameter to the regexp_substr()
function.
SQLFiddle for Oracle: http://sqlfiddle.com/#!4/b138c/1
SQLFiddle for Postgres: http://sqlfiddle.com/#!15/4b2bb/1
Use split_part
which was purposely built for this:
split_part(string, '_', 1)
Quoting this API docs:
SPLIT_PART()
function splits a string on a specified delimiter and returns the nth substring.
The 3 parameters are the string to be split, the delimiter, and the part/substring number (starting from 1) to be returned.
So if you have a field named string
that contains stuff like AB_XXX
and you would like to get everything before _
, then you split by that and get the first part/substring: split_part(string, '_', 1)
.
The standard SQL string functions are described at: SQL String Functions and Operators.
There's a substring
function that can extract contents directly, without having to nest function calls. It's detailed in Pattern matching as:
The substring function with three parameters, substring(string from pattern for escape-character), provides extraction of a substring that matches an SQL regular expression pattern. As with SIMILAR TO, the specified pattern must match the entire data string, or else the function fails and returns null. To indicate the part of the pattern that should be returned on success, the pattern must contain two occurrences of the escape character followed by a double quote ("). The text matching the portion of the pattern between these markers is returned.
In your case:
select substring('AB_XX' from '#"%#"#_%' for '#');
Result:
substring ----------- AB (1 row)
The syntax is a bit weird, especially since _
is a wildcard for a single character so it has to be quoted, but it's the SQL standard.
For the syntax that more people use, consider regexp_replace()
or a similar function working with POSIX regular expressions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With