I need a regular expression to match only the first two words (they may contain letters, numbers, commas and other punctuation, but not spaces, tabs or newlines) in a string.
My solution is ([^\s]+\s+){2}
but although it matches something like '123 word' in '123 word, hello', it doesn't work on a string with just two words and no whitespace after them.
What is the right regex for this task?
Basically, (0+1)* matches any sequence of ones and zeroes. So, in your example, (0+1)*1(0+1)* should match any sequence that contains at least one 1. It would not match 000, but it would match 010, 1, 111, etc. (0+1) means 0 OR 1, and 1* means any number of ones (including none).
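As a rough illustration, the formal expression above can be tried out in Python. This is only a sketch: Python's `re` syntax writes the union `+` as `|`, and the helper name `has_a_one` is made up for this example.

```python
import re

# Formal notation (0+1)*1(0+1)* rewritten in Python regex syntax:
# the union operator "+" becomes "|".
CONTAINS_ONE = re.compile(r"(?:0|1)*1(?:0|1)*")

def has_a_one(bits: str) -> bool:
    """Return True if the whole string is made of 0s and 1s and contains a 1."""
    # fullmatch requires the entire string to fit the pattern.
    return CONTAINS_ONE.fullmatch(bits) is not None

for sample in ["000", "010", "1", "111"]:
    print(sample, has_a_one(sample))
```

Only 000 should be rejected here, since it is the only sample without a 1.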
With regex you have a couple of options to match a digit. You can use a literal digit from 0 to 9 to match exactly that digit. Or you can match a range of digits with a character class, e.g. [4-9]. If the class allows any digit (i.e. [0-9]), it can be replaced with the shorthand \d.
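The three options can be compared side by side in a small Python sketch. The helper `first_match` and the sample string are invented for illustration. One caveat specific to Python 3: for `str` patterns, `\d` also matches non-ASCII decimal digits unless the `re.ASCII` flag is given, so it is passed here to keep `\d` equivalent to `[0-9]`.

```python
import re

SINGLE = re.compile(r"7")                # a literal digit matches only itself
RANGED = re.compile(r"[4-9]")            # character class: one digit from 4 to 9
ANY_DIGIT = re.compile(r"\d", re.ASCII)  # shorthand; re.ASCII limits it to 0-9

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

sample = "a3b5"
print(first_match(SINGLE, sample))     # no 7 in the sample
print(first_match(RANGED, sample))     # skips 3, finds 5
print(first_match(ANY_DIGIT, sample))  # finds 3
```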
?= is a positive lookahead, a type of zero-width assertion. It says that the match must be followed by whatever is within the parentheses, but that part isn't included in the match. Your example means the match needs to be followed by zero or more characters and then a digit (but again, that part isn't captured).
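A small Python sketch may make the zero-width behaviour easier to see. The pattern below is an assumption: the original example is not shown, so `[a-z]+(?=.*\d)` is just one hypothetical pattern that fits the description "followed by zero or more characters and then a digit".

```python
import re

# Hypothetical pattern: a run of lowercase letters that must be followed,
# somewhere later in the string, by a digit. The lookahead checks for the
# digit but does not consume any characters.
WORD_BEFORE_DIGIT = re.compile(r"[a-z]+(?=.*\d)")

def word_before_digit(text):
    """Return (matched text, end offset) or None if the lookahead fails."""
    m = WORD_BEFORE_DIGIT.search(text)
    return (m.group(0), m.end()) if m else None

print(word_before_digit("abc def 5"))  # digit present later in the string
print(word_before_digit("abc def"))    # no digit anywhere
```

In the first call the match is only `abc` and ends at offset 3, even though the lookahead had to scan ahead to the digit.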
You have it almost right:
([^\s]+\s+[^\s]+)
This assumes you don't need stricter control over which characters are allowed.
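To check how this pattern behaves, here is a minimal Python sketch. The function name `first_two_words` is made up for the example. Because the pattern no longer requires whitespace after the second word, it also works when the string ends right after that word.

```python
import re

# The pattern from the answer: word, whitespace, word.
# [^\s] is equivalent to the shorthand \S.
FIRST_TWO = re.compile(r"([^\s]+\s+[^\s]+)")

def first_two_words(text):
    """Return the first two whitespace-separated words, or None."""
    m = FIRST_TWO.search(text)
    return m.group(1) if m else None

print(first_two_words("123 word, hello"))  # punctuation stays attached to the word
print(first_two_words("123 word"))         # no trailing whitespace needed
print(first_two_words("single"))           # fewer than two words
```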
If you need to match either two words or just one word, you may use one of these:
([^\s]+\s+[^\s]+|[^\s]+)
([^\s]+(?:\s+[^\s]+)?)
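Both variants should give the same results. A quick Python sketch to compare them (the helper name `one_or_two_words` is hypothetical):

```python
import re

# Variant 1: try two words first, fall back to one word.
ALTERNATION = re.compile(r"([^\s]+\s+[^\s]+|[^\s]+)")
# Variant 2: one word, optionally followed by whitespace and a second word.
OPTIONAL_GROUP = re.compile(r"([^\s]+(?:\s+[^\s]+)?)")

def one_or_two_words(pattern, text):
    """Return the first one or two words matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

for sample in ["123 word, hello", "123 word", "single", ""]:
    print(repr(sample),
          one_or_two_words(ALTERNATION, sample),
          one_or_two_words(OPTIONAL_GROUP, sample))
```

The second variant avoids repeating the word pattern three times, which makes it a little easier to change later.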
Instead of trying to match the words, you could split the string on whitespace with preg_split().
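The same splitting idea can be sketched in Python. Note that this is an analogue, not the PHP call itself: it uses Python's built-in `str.split` in place of `preg_split()`, and the function name is invented for the example.

```python
def first_two_words_by_split(text):
    """Return up to the first two whitespace-separated words as a list."""
    # With no separator given, split() breaks on runs of whitespace
    # (spaces, tabs, newlines) and ignores leading whitespace.
    # maxsplit=2 stops early so the rest of the string is not split further.
    return text.split(maxsplit=2)[:2]

print(first_two_words_by_split("123 word, hello world"))
print(first_two_words_by_split("single"))
print(" ".join(first_two_words_by_split("123\tword,\nhello")))
```

Joining the pieces with a single space normalizes whatever whitespace originally separated the two words, which may or may not be what you want.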