I'm trying to use Python's re.sub to remove a colon that sits immediately before the antepenultimate vowel [aeiou] of a word, but only if that colon is itself immediately preceded by another vowel.
In other words, the colon has to be between the 3rd and 4th vowels, counting from the end of the word.
Numbering the vowels from the end, the first example below breaks down as w4:32ny1h.
we:aanyoh > weaanyoh # w4:32ny1h
hiru:atghigu > hiruatghigu
yo:ubeki > youbeki
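To make the numbering concrete, here is a minimal sketch that reproduces the w4:32ny1h breakdown. The helper name `number_vowels` is made up for illustration, and it assumes only a, e, i, o, u count as vowels (so y does not).

```python
def number_vowels(word, vowels="aeiou"):
    """Replace each vowel with its position counted from the end of the word."""
    count = 0
    out = []
    # Walk the word backwards so the last vowel gets number 1.
    for ch in reversed(word):
        if ch in vowels:
            count += 1
            out.append(str(count))
        else:
            out.append(ch)
    # Restore the original left-to-right order.
    return "".join(reversed(out))


print(number_vowels("we:aanyoh"))  # w4:32ny1h
```

The colon should be removed only when it ends up directly between the 4 and the 3.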
Below is the RegEx statement I'm trying to use but I can't get it to work.
word = re.sub(ur"([aeiou]):([aeiou])(([^aeiou])*([aeiou])*([aeiou])([^aeiou])*([aeiou]))$", ur'\1\2\3\4', word)
Don't you just have too many parentheses (and other extra stuff)? Try:
word = re.sub(ur"([aeiou]):(([aeiou][^aeiou]*){3})$", ur'\1\2', word)
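As a quick check, the sketch below runs both the pattern from the question and the simplified one over the three sample words. It uses plain raw strings (r"...") instead of ur"..." so it runs on Python 2 and 3 alike; the names `ORIGINAL`, `SIMPLIFIED` and `drop_colon` are only for illustration. As far as I can tell, the original fails for two reasons: it requires the word to end in a vowel, and its replacement appends group 4 (a consonant already contained in group 3) a second time.

```python
import re

# Pattern from the question: the word must end in a vowel, and \4 repeats a consonant.
ORIGINAL = re.compile(
    r"([aeiou]):([aeiou])(([^aeiou])*([aeiou])*([aeiou])([^aeiou])*([aeiou]))$"
)
# Simplified pattern: vowel, colon, then exactly three vowel-led chunks to the end.
SIMPLIFIED = re.compile(r"([aeiou]):(([aeiou][^aeiou]*){3})$")


def drop_colon(word):
    """Remove the colon between the 4th and 3rd vowel from the end, if present."""
    return SIMPLIFIED.sub(r"\1\2", word)


for word in ["we:aanyoh", "hiru:atghigu", "yo:ubeki"]:
    broken = ORIGINAL.sub(r"\1\2\3\4", word)
    print("{} > {} (original pattern gives {})".format(word, drop_colon(word), broken))
```

With the original pattern, we:aanyoh is left unchanged because it ends in h, while yo:ubeki comes out as youbekib.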
Not sure if you want to completely ignore consonants; this regex will. Otherwise it's similar to Jeff's.
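import re

tests = [
    'we:aanyoh',
    'hiru:atghigu',
    'yo:ubeki',
    'yo:ubekiki',  # four vowels after the colon: left unchanged
    'yo:ubek'      # only two vowels after the colon: left unchanged
]

for word in tests:
    # Group 1: the vowel before the colon, with any surrounding consonants.
    # Group 2: exactly three vowels (consonants ignored) up to the end of the word.
    s = re.sub(r'([^aeiou]*[aeiou][^aeiou]*):((?:[^aeiou]*[aeiou]){3}[^aeiou]*)$', r'\1\2', word)
    print('{} > {}'.format(word, s))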
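To show what "ignoring consonants" changes in practice, here is a small side-by-side sketch of the two patterns in this thread. The words yon:ubeki and yo:nubeki are made up for the comparison (they are not from the question), and `STRICT`, `LOOSE` and `compare` are illustrative names.

```python
import re

# Jeff's pattern: a vowel must touch the colon on both sides.
STRICT = re.compile(r"([aeiou]):(([aeiou][^aeiou]*){3})$")
# This answer's pattern: consonants may sit between the colon and the nearest vowels.
LOOSE = re.compile(r"([^aeiou]*[aeiou][^aeiou]*):((?:[^aeiou]*[aeiou]){3}[^aeiou]*)$")


def compare(word):
    """Return (strict result, loose result) for one word."""
    return STRICT.sub(r"\1\2", word), LOOSE.sub(r"\1\2", word)


for word in ["yo:ubeki", "yon:ubeki", "yo:nubeki"]:
    strict, loose = compare(word)
    print("{}: strict={} loose={}".format(word, strict, loose))
```

Both patterns agree on yo:ubeki, but only the consonant-ignoring one removes the colon from yon:ubeki and yo:nubeki. Which behaviour is right depends on whether a consonant next to the colon should block the removal.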