
PCRE regex to match the first two words, numbers

I need a regular expression that matches only the first two words in a string. The words may contain letters, numbers, commas and other punctuation, but not whitespace (spaces, tabs or newlines). My attempt is ([^\s]+\s+){2}. It matches the first two words of '123 word, hello' (along with the trailing space), but it fails on a string that contains just two words with no whitespace after them.

What is the right regex for this task?

Boris D. Teoharov asked Oct 12 '12 21:10



2 Answers

You have it almost right:

([^\s]+\s+[^\s]+)

This assumes you don't need stricter control over which characters are allowed.

If you need to match either two words or just a single word, you can use one of these:

([^\s]+\s+[^\s]+|[^\s]+)
([^\s]+(?:\s+[^\s]+)?)
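
To see how the patterns above behave, here is a minimal sketch. It uses Python's re module as a stand-in for PCRE (an assumption; this particular syntax should behave the same in both), and the helper name first_match is hypothetical, introduced only for illustration.

```python
import re

# Patterns copied from the answer above; [^\s] is equivalent to \S.
TWO_WORDS = re.compile(r"([^\s]+\s+[^\s]+)")
ONE_OR_TWO_WORDS = re.compile(r"([^\s]+(?:\s+[^\s]+)?)")
# Pattern from the question, kept for comparison.
QUESTION_PATTERN = re.compile(r"([^\s]+\s+){2}")


def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(TWO_WORDS, "123 word, hello"))   # 123 word,
print(first_match(TWO_WORDS, "123 word"))          # 123 word
print(first_match(QUESTION_PATTERN, "123 word"))   # None
print(first_match(ONE_OR_TWO_WORDS, "single"))     # single
```

The question's pattern fails on '123 word' because it requires whitespace after the second word as well. The patterns here are unanchored, so they find the first two words wherever they start; prefixing ^\s* would be one way to tie the match to the beginning of the string.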

Vyktor answered Sep 19 '22 22:09


Instead of trying to match the words, you could split the string on whitespace with preg_split().
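
A rough sketch of the splitting approach, written with Python's re.split rather than preg_split() (a swapped-in equivalent, not the PHP call itself). The function name first_two_words is hypothetical. In PHP, the limit argument of preg_split() could play a role similar to maxsplit here.

```python
import re


def first_two_words(text):
    """Return up to the first two whitespace-separated words of text."""
    # Strip first so leading whitespace does not produce an empty first item.
    # maxsplit=2 stops after two splits; the remainder stays in one last chunk.
    parts = re.split(r"\s+", text.strip(), maxsplit=2)
    # Keep only the first two pieces and drop the empty string that
    # re.split returns for an empty input.
    return [p for p in parts[:2] if p]


print(first_two_words("123 word, hello world"))  # ['123', 'word,']
print(first_two_words("123 word"))               # ['123', 'word']
print(first_two_words("single"))                 # ['single']
```

Splitting avoids the trailing-whitespace problem entirely, since a string with exactly two words simply yields two pieces.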

Brian answered Sep 18 '22 22:09