
regex nonconsecutive match

Tags: regex, dart

I'm trying to match a word that has 2 vowels in it (they don't have to be consecutive), but every regex I've come up with either matches nothing or not enough. This is the latest iteration (Dart).

  final vowelRegex = new RegExp(r'[aeiouy]{2}');

Here's an example sentence being parsed. It should match one, shoulder, their, and over, but it's only matching shoulder and their. I understand why: the expression I defined requires the two vowels to be adjacent. How can the expression be defined to match on 2 vowels, regardless of their position in the word?

  one shoulder their the which over

The expression only needs to be tested against one word at a time, so hopefully this simplifies things.

Will Lopez asked Jan 05 '23 19:01


2 Answers

You can use:

new RegExp(r'(\w*[aeiouy]\w*){2}');
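
As a quick check, here is a minimal sketch that runs this pattern over the sample sentence from the question, one word at a time. The helper name wordsWithTwoVowels is made up for illustration, and the sentence is split on single spaces, which is an assumption about the input.

```dart
// Hypothetical helper (not from the original answer): keeps the words of
// [sentence] that contain at least two vowels, testing one word at a time.
List<String> wordsWithTwoVowels(String sentence) {
  // Pattern from the answer; the `new` keyword is optional in Dart 2 and later.
  final vowelRegex = RegExp(r'(\w*[aeiouy]\w*){2}');
  // Assumes words are separated by single spaces.
  return sentence.split(' ').where((w) => vowelRegex.hasMatch(w)).toList();
}

void main() {
  print(wordsWithTwoVowels('one shoulder their the which over'));
  // [one, shoulder, their, over]
}
```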

Alexandre Ardhuin answered Jan 10 '23 21:01


Both of the previous two answers are incorrect.

(\S*[aeiouy]\S*){2} can match runs of non-whitespace characters even when they contain non-word characters. For example, it matches all of a-e, which contains the non-word character -.

\S*[aeiouy]\S*[aeiouy]\S* has the same problem: it also matches a-e.


Correct solution:

\b([^\Waeiou]*[aeiou]){2}\w*\b
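
In Dart this could be used roughly as follows. It's only a sketch: the helper name findTwoVowelWords is hypothetical, and the `!` on group(0) assumes a null-safe Dart version (2.12 or later). Note that this pattern treats y as a consonant; if y should count as a vowel, as in the question's character class, add it to both bracket expressions.

```dart
// Hypothetical helper: returns every whole word in [text] that the
// word-boundary pattern matches, i.e. words with at least two vowels.
List<String> findTwoVowelWords(String text) {
  final twoVowels = RegExp(r'\b([^\Waeiou]*[aeiou]){2}\w*\b');
  // group(0) is the full match; it is never null for a successful match.
  return twoVowels.allMatches(text).map((m) => m.group(0)!).toList();
}

void main() {
  print(findTwoVowelWords('one shoulder their the which over'));
  // [one, shoulder, their, over]
}
```

Because [^\Waeiou] only accepts word characters, a match can never run across a space or a hyphen, so a-e is not matched.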

If you want only whitespace (or the start or end of the string) to count as a word boundary, rather than any non-word character, use the following regex instead. The target word is in capture group 2.

(\s|^)(([^\Waeiou]*[aeiou]){2}\w*)(\s|$)
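
A small sketch of reading capture group 2 in Dart, testing one word at a time as the question describes. The helper name twoVowelWord is hypothetical.

```dart
// Hypothetical helper: returns the word captured in group 2, or null when
// [input] contains no whitespace-delimited word with at least two vowels.
String? twoVowelWord(String input) {
  final delimited = RegExp(r'(\s|^)(([^\Waeiou]*[aeiou]){2}\w*)(\s|$)');
  // Group 1 is the leading delimiter, group 2 is the word itself.
  return delimited.firstMatch(input)?.group(2);
}

void main() {
  print(twoVowelWord('shoulder')); // shoulder
  print(twoVowelWord('the'));      // null
  print(twoVowelWord('a-e'));      // null
}
```

One caveat if you run it over a whole sentence with allMatches: each match consumes its trailing whitespace, so the word immediately after a match cannot use that same space as its leading delimiter and is skipped. Testing one word at a time avoids this.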

Travis answered Jan 10 '23 20:01