regex to remove ordinals

Tags: regex

I need to remove ordinal suffixes via regex, but my regex skills are quite lacking. The following pattern locates the ordinals, but the match also includes the digit just before the suffix. I need to isolate and remove just the suffix.

[0-9](?:st|nd|rd|th)
lcdservices asked May 31 '11 02:05


4 Answers

You need a lookbehind assertion, so that st|nd|rd|th is matched only when it is preceded by [0-9], without the [0-9] itself being part of the match:

(?<=[0-9])(?:st|nd|rd|th)

That is Perl-compatible syntax; if you're using POSIX, POSIX extended, vi, or one of the many other regex dialects, you'll need to look up the equivalent syntax.
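
For illustration, here is a minimal sketch of the same lookbehind in Python, whose `re` module accepts this syntax. The function name `strip_ordinal_suffixes` is made up for the example and is not part of the original answer.

```python
import re

# Lookbehind: the suffix must directly follow a digit,
# but the digit itself is not consumed by the match.
ORDINAL_SUFFIX = re.compile(r"(?<=[0-9])(?:st|nd|rd|th)")

def strip_ordinal_suffixes(text):
    """Remove st/nd/rd/th wherever it directly follows a digit."""
    return ORDINAL_SUFFIX.sub("", text)

print(strip_ordinal_suffixes("the 1st, 2nd, 3rd and 4th"))
```

Because only the suffix is matched, the replacement can be an empty string and the digits stay in place. Note that this pattern has no trailing word boundary, so a string like "4things" would also lose its "th".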

joelhardi answered Nov 19 '22 06:11


In Perl (the /g flag replaces every occurrence, not just the first):

$var =~ s{\b(\d+)(?:st|nd|rd|th)\b}{$1}g;

In PHP:

$var = preg_replace('/\\b(\d+)(?:st|nd|rd|th)\\b/', '$1', $var);

In .NET:

var = Regex.Replace(var, @"\b(\d+)(?:st|nd|rd|th)\b", "$1");
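
For comparison, a rough Python equivalent of the same capture-group approach might look like the sketch below. The name `remove_ordinals` is only an illustrative choice.

```python
import re

# Capture the digits, match the suffix, and require word
# boundaries on both sides so "4things" is left alone.
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b")

def remove_ordinals(text):
    """Replace each ordinal such as '22nd' with just its digits."""
    return ORDINAL.sub(r"\1", text)

print(remove_ordinals("the 1st and 22nd"))
```

This variant needs no lookbehind support, since the digits are matched and then written back via the backreference.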
King Skippus answered Nov 19 '22 05:11


If you also want to remove the numbers themselves along with their ordinal suffixes, you could use this one:

[0-9]+(?:st| st|nd| nd|rd| rd|th| th)

So for the text "The 3rd person is missing but the 2 nd and the 1st is here" you'll get "The person is missing but the and the is here" (with a doubled space left wherever a number was removed).
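
To try this out, here is a small Python sketch using the pattern exactly as written above. The function name `drop_ordinal_numbers` is hypothetical.

```python
import re

# Digits followed by a suffix, with an optional single space in between.
NUMBER_WITH_ORDINAL = re.compile(r"[0-9]+(?:st| st|nd| nd|rd| rd|th| th)")

def drop_ordinal_numbers(text):
    """Delete each number together with its ordinal suffix."""
    return NUMBER_WITH_ORDINAL.sub("", text)

sample = "The 3rd person is missing but the 2 nd and the 1st is here"
result = drop_ordinal_numbers(sample)

# The spaces on both sides of each removed number survive,
# so collapse runs of whitespace to get a clean sentence.
print(" ".join(result.split()))
```

One caveat: because the pattern allows a space before the suffix and has no trailing word boundary, it would also match the "2 th" in "2 things".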

Raul Urcan answered Nov 19 '22 06:11


Try a positive lookbehind:

(?<=[0-9])(?:st|nd|rd|th)

assuming the dialect of regex supports it.

MRAB answered Nov 19 '22 04:11