
MySQL REGEXP: matching blank entries

Tags: regex, mysql

I have this SQL condition that is supposed to retrieve all rows that satisfy the given regexp condition:

country REGEXP ('^(USA|Italy|France)$')

However, I also need to retrieve all rows with a blank country value. Currently I am using this condition:

country REGEXP ('^(USA|Italy|France)$') OR country = ""

How can I achieve the same effect without having to include the OR clause?

Thanks, Erwin

Erwin asked Jun 17 '10 23:06




3 Answers

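This should work, although some MySQL versions reject the trailing empty alternative with an "empty (sub)expression" error: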

country REGEXP ('^(USA|Italy|France|)$')

However, from a performance point of view, you may want to use the IN syntax:

country IN ('USA','Italy','France', '')

The latter should be faster, as REGEXP can be quite slow.
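As a minimal sketch of the IN form inside a full query, assuming a hypothetical customers table with id and country columns (none of these names come from the question):

```sql
-- Hypothetical table and column names, for illustration only.
-- The empty string in the list covers the blank country values.
SELECT id, country
FROM customers
WHERE country IN ('USA', 'Italy', 'France', '');
```

Unlike REGEXP, an IN list of constants can use an index on country if one exists. Neither form matches rows where country is NULL, so NULL values would still need separate handling.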

Ben Rowe answered Oct 09 '22 08:10


There's no reason you can't use $ (match end of string) to work around the "empty subexpression" issue.

It looks a little weird, but this will actually work:

country REGEXP ('^(USA|Italy|France|$)$')
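A quick way to check this pattern is to evaluate it against literal strings. This sketch uses the values from the question plus 'abc' and a single space as arbitrary non-matching inputs:

```sql
-- Each SELECT returns 1 on a match and 0 otherwise.
SELECT ''       REGEXP '^(USA|Italy|France|$)$';  -- 1: the empty string matches via the inner $
SELECT 'France' REGEXP '^(USA|Italy|France|$)$';  -- 1
SELECT 'abc'    REGEXP '^(USA|Italy|France|$)$';  -- 0
SELECT ' '      REGEXP '^(USA|Italy|France|$)$';  -- 0: a single space is not an empty string
```

The inner $ consumes no characters, so the alternative is not syntactically empty yet still matches only the empty string.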

gnarf answered Oct 09 '22 08:10


You could try:

country REGEXP ('^(USA|Italy|France|)$')

I just added another | after France, which should basically tell it to also match ^$, which is the same as country = ''.

Update: since this method doesn't work, I would recommend you use this regex:

country REGEXP ('^(USA|Italy|France)$|^$')

Note that you can't use the regex ^(USA|Italy|France|.{0})$, because it will complain that there is an empty subexpression, although ^(USA|Italy|France)$|^.{0}$ would work.

Here are some examples of the return value of this regex:

select '' regexp '^(USA|Italy|France)$|^$'
> 1
select 'abc' regexp '^(USA|Italy|France)$|^$'
> 0
select 'France' regexp '^(USA|Italy|France)$|^$'
> 1
select ' ' regexp '^(USA|Italy|France)$|^$'
> 0

As you can see, it returns exactly what you want.

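If you want to treat all whitespace-only values as blank (e.g. 0 spaces and 5 spaces both count as blank), you can use a POSIX character class:

country REGEXP ('^(USA|Italy|France|[[:space:]]*)$')

A plain \s* is avoided here because MySQL string literals consume a single backslash, so '\s*' reaches the regex engine as s*. It would have to be written '\\s*', and MySQL versions before 8.0 may not support \s at all.

This will cause the last row in the previous example to behave differently, i.e.:

select ' ' regexp '^(USA|Italy|France|[[:space:]]*)$'
> 1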
Senseful answered Oct 09 '22 09:10