
struggling with conditional regular expressions

Tags: regex, ruby

I have a simple problem, yet I am unable to solve it. Either my string has the format ID: dddd, matched by the following regular expression:

/^ID: ([a-z0-9]*)$/

or it looks like ID: 1234 Status: 232, matched by the following regular expression:

/^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$/

Now I want to write one regular expression that can handle both. The first thing I came up with was this:

/^ID: ([a-z0-9]*)$|^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$/

It matches, but I was looking into conditional regular expressions and thought something along these lines should be possible (pseudo-code):

if the string contains /Status:/
    /^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$/
else
    /^ID: ([a-z0-9]*)$/

Only, I can't get this expressed correctly. I thought I should be using ?= but have no clue how. Something like

/((?=Status)^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$|^ID: ([a-z0-9]*)$/

but that doesn't work.

Can you help?

nathanvda asked Aug 23 '10 15:08



3 Answers

You're looking for ^ID: (\d+)(?: Status: (\d+))?$
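A quick way to check that pattern is a small Ruby sketch like the one below. The helper name parse_line is made up for illustration and is not part of the answer. Note that \d+ is stricter than the [a-z0-9]* in the question: it requires at least one digit and rejects letters.

```ruby
# Pattern from the answer above: digits only, at least one per field.
PATTERN = /^ID: (\d+)(?: Status: (\d+))?$/

# Hypothetical helper: returns [id, status]. The status is nil when the
# Status part is absent, and the whole result is nil when nothing matches.
def parse_line(line)
  match = PATTERN.match(line)
  match && match.captures
end

p parse_line("ID: 1234")              # ["1234", nil]
p parse_line("ID: 1234 Status: 232")  # ["1234", "232"]
p parse_line("ID: abcd")              # nil, because \d rejects letters
```

If IDs can contain lowercase letters, as the question's character class suggests, swapping \d for [a-z0-9] would be the obvious adjustment.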

Edit: Since the question is tagged Ruby, it's worth mentioning that according to both this question and this flavour-comparison, Ruby 1.8/1.9 doesn't do conditional regex.

http://www.regular-expressions.info is a great source on the subject.

Chris Wesseling answered Sep 28 '22 02:09


You need a .* in your lookahead: (Rubular)

/(?=.*Status)^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$|^ID: ([a-z0-9]*)$/
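One thing to watch with this alternation is that the ID ends up in a different capture group depending on which branch matched. A minimal sketch, where the constant LOOKAHEAD and the helper id_of are names invented for this illustration:

```ruby
LOOKAHEAD = /(?=.*Status)^ID: ([a-z0-9]*) Status: ([a-z0-9]*)$|^ID: ([a-z0-9]*)$/

# Hypothetical helper: the ID is in group 1 for the first alternative
# and in group 3 for the second, so check both.
def id_of(match)
  match[1] || match[3]
end

with_status = LOOKAHEAD.match("ID: 1234 Status: 232")
id_only     = LOOKAHEAD.match("ID: 1234")

p with_status.captures  # ["1234", "232", nil]
p id_only.captures      # [nil, nil, "1234"]
p id_of(id_only)        # "1234"
```

The calling code has to know about both group positions, which is extra bookkeeping the pattern itself does not remove.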

However, for your specific example you don't need a lookahead. You can just use the ? quantifier instead: (Rubular)

/^ID: ([a-z0-9]*)(?: Status: ([a-z0-9]*))?$/
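Here is a rough sketch of how that pattern behaves in Ruby. The constant COMBINED and the helper extract are hypothetical names used only for this example. With the optional non-capturing group, the ID is always group 1 and the status is always group 2, which is nil when the Status part is missing.

```ruby
COMBINED = /^ID: ([a-z0-9]*)(?: Status: ([a-z0-9]*))?$/

# Hypothetical helper: builds a hash from the two capture groups,
# or returns nil when the line does not match at all.
def extract(line)
  m = COMBINED.match(line)
  return nil unless m
  { id: m[1], status: m[2] }
end

p extract("ID: 1234 Status: 232")  # id "1234", status "232"
p extract("ID: 1234")              # id "1234", status nil
p extract("ID: ")                  # id "" (the * quantifier allows empty values)
p extract("Status: 232")           # nil, the ID part is mandatory
```

Because the character classes use *, an empty ID or status still matches. Changing * to + would reject those inputs, if that is what you want.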
Mark Byers answered Sep 28 '22 00:09


Interestingly, according to this question, Ruby 1.8/1.9 does not support conditional regular expressions.

Have you (or any of the answerers) read otherwise? If so, it might be helpful to update the linked question so that it no longer gives incorrect information.

jerhinesmith answered Sep 28 '22 00:09