I am looking for a regex to find the first one or two capitalized words in a string. If the first two words are both capitalized, I want both of them. A hyphen should be considered part of a word.
Madonna has a new album
I'm looking for Madonna
Paul Young has no new album
I'm looking for Paul Young
Emmerson Lake-palmer is not here
I'm looking for Emmerson Lake-palmer
I have been using
^[A-Z]+.*?\b( [A-Z]+.*?\b){0,1}
which does great on the first two, but for the 3rd example I get Emmerson Lake instead of Emmerson Lake-palmer.
What REGEX can I use to find the first one or two capitalized words in the above examples?
You may use
^[A-Z][-a-zA-Z]*(?:\s+[A-Z][-a-zA-Z]*)?
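Basically, use a character class, [-a-zA-Z]*, instead of a dot-matching pattern so that only letters and hyphens are matched. In the original pattern, the lazy .*? stops at the first word boundary (\b), and since a hyphen is not a word character, the match ends right before it.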
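To check the difference, here is a minimal Python sketch that runs both the pattern from the question and the suggested one against the three sample strings. The helper name leading_name and the constant names are made up for illustration, and it assumes Python's built-in re module, whose syntax handles both patterns as written.

```python
import re

# Pattern from the question: the lazy ".*?" stops at the first word
# boundary, so the match ends right before the hyphen.
ORIGINAL = re.compile(r"^[A-Z]+.*?\b( [A-Z]+.*?\b){0,1}")

# Suggested pattern: a character class that allows only ASCII letters
# and hyphens inside a word.
FIXED = re.compile(r"^[A-Z][-a-zA-Z]*(?:\s+[A-Z][-a-zA-Z]*)?")


def leading_name(text, pattern=FIXED):
    """Return the leading one or two capitalized words, or None (hypothetical helper)."""
    match = pattern.match(text)
    return match.group(0) if match else None


samples = [
    "Madonna has a new album",
    "Paul Young has no new album",
    "Emmerson Lake-palmer is not here",
]

for sample in samples:
    # Print the result of each pattern side by side for comparison.
    print(repr(leading_name(sample, ORIGINAL)), "vs", repr(leading_name(sample, FIXED)))
```

Only the third sample should differ between the two patterns, since it is the only one with a hyphen in the second word.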
Details
^ - start of string
[A-Z] - an uppercase ASCII letter
[-a-zA-Z]* - zero or more ASCII letters / hyphens
(?:\s+[A-Z][-a-zA-Z]*)? - an optional (1 or 0 due to the ? quantifier) sequence of:
  \s+ - 1+ whitespace characters
  [A-Z] - an uppercase ASCII letter
  [-a-zA-Z]* - zero or more ASCII letters / hyphens
A Unicode-aware equivalent (for the regex flavors supporting Unicode property classes):
^\p{Lu}[-\p{L}]*(?:\s+\p{Lu}[-\p{L}]*)?
where \p{L} matches any letter and \p{Lu} matches any uppercase letter.
This is probably simpler:
^([A-Z][-A-Za-z]+)(\s[A-Z][-A-Za-z]+)?
Replace + with * if you expect single-letter words.
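To see what the + versus * choice changes, here is a small Python sketch, assuming the built-in re module. The names STRICT, LENIENT and first_words are made up for illustration, and the "A Tribe called quest" input is a hypothetical example that is not among the question's samples.

```python
import re

# "+" version from the answer: every word needs at least two characters.
STRICT = re.compile(r"^([A-Z][-A-Za-z]+)(\s[A-Z][-A-Za-z]+)?")

# "*" version: a single uppercase letter also counts as a word.
LENIENT = re.compile(r"^([A-Z][-A-Za-z]*)(\s[A-Z][-A-Za-z]*)?")


def first_words(text, pattern):
    """Return the whole match, or None when the pattern does not match (hypothetical helper)."""
    match = pattern.match(text)
    return match.group(0) if match else None


# Hypothetical input that starts with a single-letter word.
sample = "A Tribe called quest"
print(first_words(sample, STRICT))
print(first_words(sample, LENIENT))
```

With the + version the single-letter first word makes the whole match fail, so nothing is returned at all, which is worth keeping in mind when choosing between the two.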