
Regexp to find only 6 digit number in text PostgreSQL

I need to build a query in PostgreSQL that finds all text entries containing a 6-digit number (e.g. 000999, 019290, 998981, 234567, etc). The problem is that the number is not necessarily at the beginning of the string or at its end.

I tried the following, none of which worked:

  • [0-9]{6} - returns part of a number with more than 6 digits
  • (?:(?<!\d)\d{6}(?!\d)) - PostgreSQL does not support lookbehind
  • [^0-9][0-9]{6}[^0-9] and variations on it, but to no avail.

Building my own Perl/C function is not really an option as I do not have the skills required. Any idea what regexp could be used or other tricks that elude me at the moment?

EDIT

Input samples:

  • aa 0011527 /CASA -> should return NOTHING
  • aa 001152/CASA -> should return 001152
  • aa001152/CASA -> should return 001152
  • aa0011527/CASA -> should return NOTHING
  • aa001152 /CASA -> should return 001152
CristisS asked Jan 24 '13 14:01


1 Answer

If PostgreSQL supports word boundaries, use \b:

\b(\d{6})\b

Edit:

\b in PostgreSQL means backspace, so it's not a word boundary.

However, the PostgreSQL documentation (http://www.postgresql.org/docs/8.3/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP) explains that you can use \y as a word boundary, since it matches only at the beginning or end of a word, so

\y(\d{6})\y

should work.

\m(\d{6})\M

should also work.

Full list of constraint escapes (word and string boundaries) in PostgreSQL regex:

Escape  Description
\A      matches only at the beginning of the string (see Section 9.7.3.5 for how this differs from ^)
\m      matches only at the beginning of a word
\M      matches only at the end of a word
\y      matches only at the beginning or end of a word
\Y      matches only at a point that is not the beginning or end of a word
\Z      matches only at the end of the string (see Section 9.7.3.5 for how this differs from $)
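
One caveat with word boundaries is that letters and digits are both word characters, so there is no boundary between "aa" and "001152". A minimal sketch of that behaviour, using Python's re module as a stand-in (an assumption: Python's \b is taken to play roughly the role of PostgreSQL's \y; the function name is hypothetical):

```python
import re

# Assumption: Python's \b is used here in place of PostgreSQL's \y.
WORD_BOUNDED = re.compile(r"\b(\d{6})\b")

# Sample inputs taken from the question.
samples = [
    "aa 0011527 /CASA",
    "aa 001152/CASA",
    "aa001152/CASA",
    "aa0011527/CASA",
    "aa001152 /CASA",
]

def find_word_bounded(text):
    """Return the first 6-digit run delimited by word boundaries, or None."""
    m = WORD_BOUNDED.search(text)
    return m.group(1) if m else None

results = {s: find_word_bounded(s) for s in samples}
for s, r in results.items():
    print(repr(s), "->", r)
```

Only "aa 001152/CASA" yields 001152 here; the two inputs where the number is glued to "aa" are missed, which is why a word-boundary pattern alone does not cover all the samples.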

New edit:

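Based on your edit, word boundaries are not enough: letters and digits are both word characters, so there is no boundary between "aa" and "001152". Instead, require a non-digit (or the start or end of the string) on each side of exactly six digits; the number itself is in the second capture group:

(^|[^\d])(\d{6})([^\d]|$)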
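
To check the non-digit-delimiter idea against the sample inputs from the question, here is a minimal sketch. It uses Python's re module as a stand-in for PostgreSQL's regex engine (an assumption; the flavours differ in other respects), writes the digit run as \d{6} so that exactly six digits are required, and the function name is hypothetical:

```python
import re

# A non-digit (or string start) + exactly six digits + a non-digit (or string end).
SIX_DIGITS = re.compile(r"(^|[^\d])(\d{6})([^\d]|$)")

def extract_six_digits(text):
    """Return the first standalone 6-digit number in text, or None."""
    m = SIX_DIGITS.search(text)
    # Group 2 holds the digits; groups 1 and 3 are the surrounding delimiters.
    return m.group(2) if m else None

# Sample inputs taken from the question.
samples = [
    "aa 0011527 /CASA",
    "aa 001152/CASA",
    "aa001152/CASA",
    "aa0011527/CASA",
    "aa001152 /CASA",
]

extracted = {s: extract_six_digits(s) for s in samples}
for s, r in extracted.items():
    print(repr(s), "->", r)
```

The 7-digit inputs return None and the other three return 001152, matching the expected results listed in the question. Because the delimiters are part of the match, the digits have to be read from the second group rather than from the whole match.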
h2ooooooo answered Nov 15 '22 05:11