I need to build a query in PostgreSQL that finds all text entries containing a 6-digit number (e.g. 000999, 019290, 998981, 234567, etc.). The problem is that the number is not necessarily at the beginning of the string or at its end.
I tried the following, and none of it worked:
[0-9]{6}
- returns part of a number with more than 6 digits
(?:(?<!\d)\d{6}(?!\d))
- PostgreSQL does not know about lookbehind
[^0-9][0-9]{6}[^0-9]
and variations on it, but to no avail.
Building my own Perl/C function is not really an option, as I do not have the skills required. Any idea what regexp could be used, or other tricks that elude me at the moment?
EDIT
Input samples:
aa 0011527 /CASA -> should return NOTHING
aa 001152/CASA -> should return 001152
aa001152/CASA -> should return 001152
aa0011527/CASA -> should return NOTHING
aa001152 /CASA -> should return 001152
If PostgreSQL supports word boundaries, use \b:
\b(\d{6})\b
Edit:
\b in PostgreSQL means backspace, so it's not a word boundary.
http://www.postgresql.org/docs/8.3/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP explains, however, that you can use \y as a word boundary, since it "matches only at the beginning or end of a word", so
\y(\d{6})\y
should work.
\m(\d{6})\M
should also work.
Full list of constraint escapes (word and string anchors) in PostgreSQL regex:
Escape Description
\A matches only at the beginning of the string (see Section 9.7.3.5 for how this differs from ^)
\m matches only at the beginning of a word
\M matches only at the end of a word
\y matches only at the beginning or end of a word
\Y matches only at a point that is not the beginning or end of a word
\Z matches only at the end of the string (see Section 9.7.3.5 for how this differs from $)
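To see how the word-boundary escapes behave in practice, here is a small sketch that runs both patterns against a few strings. The VALUES list is only a stand-in for a real table, and the literals assume standard_conforming_strings is on (the default since PostgreSQL 9.1); on older servers the backslashes would need to be doubled inside an E'...' string.

```sql
-- Test \y...\y and \m...\M against three illustrative strings.
SELECT s,
       s ~ '\y\d{6}\y' AS matches_y,   -- boundary on either side
       s ~ '\m\d{6}\M' AS matches_m    -- start-of-word, then end-of-word
FROM (VALUES
        ('aa 001152 /CASA'),   -- six digits separated by spaces: true, true
        ('aa 0011527 /CASA'),  -- seven digits: false, false
        ('aa001152/CASA')      -- digits glued to letters: false, false
     ) AS t(s);
```

The last row is the interesting one: letters and digits are both word characters, so there is no word boundary between "aa" and "001152", and neither pattern matches there.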
New edit:
Based on your edit, you should be able to do this:
(^|[^\d])(\d{6})([^\d]|$)
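As a rough sketch of how a pattern of this shape could be used in a query, the example below runs it over the sample inputs from the question. Two things here are my assumptions rather than part of the answer: the digit group is pinned to exactly six digits with \d{6}, and the outer groups are made non-capturing with (?:...), because substring() returns the first capturing group when the pattern contains one. The VALUES list stands in for a real table, and standard_conforming_strings is assumed to be on.

```sql
-- Extract a standalone 6-digit run: preceded by start-of-string or a
-- non-digit, and followed by a non-digit or end-of-string.
SELECT s,
       substring(s from '(?:^|[^\d])(\d{6})(?:[^\d]|$)') AS six_digits
FROM (VALUES
        ('aa 0011527 /CASA'),  -- NULL (seven digits)
        ('aa 001152/CASA'),    -- 001152
        ('aa001152/CASA'),     -- 001152
        ('aa0011527/CASA'),    -- NULL (seven digits)
        ('aa001152 /CASA')     -- 001152
     ) AS t(s);
```

To filter rows rather than extract the number, the same pattern can go in a WHERE clause with the ~ operator, e.g. WHERE s ~ '(^|[^\d])\d{6}([^\d]|$)'.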