
Regular Expression to extract part of string

I have strings in one of the following forms:

Foo
"Foo"
"Some Foo"
"Some Foo and more"

I need to extract the value Foo, but only when it appears inside quotes, where it may be surrounded by any number of alphanumeric and whitespace characters. For the examples above, I would like the output to be

<NoMatch>
Foo
Foo
Foo

This is the pattern I have so far, using lookahead and lookbehind for the quotes. It works for "Foo" but not for the other cases.

(?<=")Foo(?=")

Further expanding this to

(?<=")(?<=.*?)Foo(?=.*?)(?=")

does not work.

Any assistance will be appreciated!

Kami asked May 24 '13 10:05


2 Answers

If quotes are correctly balanced and quoted strings don't span multiple lines, then you can look ahead from the match to the end of the line and count the quotes that follow. If an even number of quotes follows, Foo is outside a quoted string; if an odd number follows, we know that we're inside one. The negative lookahead below rejects the even case:

Foo(?![^"\r\n]*(?:"[^"\r\n]*"[^"\r\n]*)*$)

Explanation:

Foo          # Match Foo
(?!          # only if the following can't be matched here:
 [^"\r\n]*   # Any number of characters except quotes or newlines
 (?:         # followed by
  "[^"\r\n]* # (a quote and any number of non-quotes/newlines
  "[^"\r\n]* # twice)
 )*          # any number of times.
 $           # End of the line
)            # End of lookahead assertion

See it live on regex101.com
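
As a quick check, the pattern above can be run against the four inputs from the question. This is a minimal sketch in Python, assuming each input is a single line; the helper name `find_quoted_foo` is made up for illustration and is not part of the answer.

```python
import re

# Pattern copied verbatim from the answer; the names below are hypothetical.
INSIDE_QUOTES = re.compile(r'Foo(?![^"\r\n]*(?:"[^"\r\n]*"[^"\r\n]*)*$)')

def find_quoted_foo(line):
    """Return 'Foo' if it occurs inside a quoted string on this line, else None."""
    match = INSIDE_QUOTES.search(line)
    return match.group(0) if match else None

# The four sample inputs from the question.
for sample in ['Foo', '"Foo"', '"Some Foo"', '"Some Foo and more"']:
    print(repr(sample), '->', find_quoted_foo(sample))
```

Without `re.MULTILINE`, Python's `$` only matches at the end of the string (or just before a trailing newline), so text with several lines would need that flag or would have to be processed line by line.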

Tim Pietzcker answered Oct 27 '22 13:10


In many regex engines, lookbehind ((?<=something)) doesn't work with variable-length patterns such as .* (lookahead, (?=something), usually does). Try this:

(?<=")(.*?)(Foo)(.*?)(?=")

and then use the captured groups: the second group holds Foo, and the first and third hold the text around it. How you access them depends on your language: $1, $2, ... or \1, \2, ... or the elements of a match array.
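
To make the group numbering concrete, here is a small Python sketch of the capture-group approach. The helper name `split_around_foo` is an assumption for illustration; other languages expose the groups differently.

```python
import re

# Pattern copied verbatim from the answer; the names below are hypothetical.
CAPTURE = re.compile(r'(?<=")(.*?)(Foo)(.*?)(?=")')

def split_around_foo(text):
    """Return (before, 'Foo', after) for the first match, or None if there is none."""
    match = CAPTURE.search(text)
    return match.groups() if match else None

for sample in ['Foo', '"Foo"', '"Some Foo"', '"Some Foo and more"']:
    print(repr(sample), '->', split_around_foo(sample))
```

One caveat: because `.*?` can run across quote characters, this pattern also matches a Foo that sits between two separate quoted strings, as in `"a" Foo "b"`. It is only safe when each line holds at most one quoted string.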

Vedran Šego answered Oct 27 '22 11:10