I want to find every line which consists only of letters a, b and c. I've got the regular expression
print(re.findall('^[abc]+$', text))
but I get no result back from this text:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
fsadfasd
quis nostraud exercitatione ullamco laboiris nisi ut aloiquip ex ea commuodo consequat.
gfgfgasdas
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
aaaabbbbcccaabcccabc
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
aabcbcbcbbabbbabcbbcbcf
culpa qui ofaeiouficia deserunt mollit anim id est laborum.
bbcbcbcbcbcbcbcbcbcbcbcbcbc
aeiou
aaaaaaaaaaaaaaaaaaaaaaaa
Why is this? I think the problem is with the ^ and $ characters, but I don't understand why.
You want to find every line that consists of only these letters, so the pattern has to be applied line by line. Pass the re.MULTILINE flag:
print(re.findall('^[abc]+$', text, re.MULTILINE))
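Without this flag, re treats text as a single string: ^ matches only at the very start of the text and $ only at the very end (or just before a trailing newline), so the pattern would have to match the entire content at once. With re.MULTILINE, ^ and $ also match at the start and end of every line.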
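Here is a small sketch of the difference. The sample text is a shortened stand-in for the text in the question, and the names without_flag, with_flag and per_line are only for illustration. The last variant is an alternative that avoids the flag by checking each line separately with re.fullmatch.

```python
import re

# Shortened sample text (an assumption for illustration, not the full text from the question).
text = "Lorem ipsum dolor sit amet\naaaabbbbccc\naabcbcf\nbbcbcbc\naeiou\naaaa"

# Without re.MULTILINE, ^ and $ anchor to the whole string, so nothing matches.
without_flag = re.findall(r'^[abc]+$', text)

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
with_flag = re.findall(r'^[abc]+$', text, re.MULTILINE)

# Flag-free alternative: split into lines and require a full match on each one.
per_line = [line for line in text.splitlines() if re.fullmatch(r'[abc]+', line)]

print(without_flag)  # []
print(with_flag)     # ['aaaabbbbccc', 'bbcbcbc', 'aaaa']
print(per_line)      # ['aaaabbbbccc', 'bbcbcbc', 'aaaa']
```

Both approaches give the same lines here. The per-line version can be easier to read if you need to process each line further anyway.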