Let assume that I have some string: "Lorem ipsum dolor sit amet" I need a list of all words with lenght more than 3. Can I do it with regular expressions?
e.g.
pattern = re.compile(r'some pattern')
result = pattern.search('Lorem ipsum dolor sit amet').groups()
result contains 'Lorem', 'ipsum', 'dolor' and 'amet'.
EDITED:
The words I mean can only contains letters and numbers.
>>> import re
>>> myre = re.compile(r"\w{4,}")
>>> myre.findall('Lorem, ipsum! dolor sit? amet...')
['Lorem', 'ipsum', 'dolor', 'amet']
Take note that in Python 3, where all strings are Unicode, this will also find words that use non-ASCII letters:
>>> import re
>>> myre = re.compile(r"\w{4,}")
>>> myre.findall('Lorem, ipsum! dolör sit? amet...')
['Lorem', 'ipsum', 'dolör', 'amet']
In Python 2, you'd have to use
>>> myre = re.compile(r"\w{4,}", re.UNICODE)
>>> myre.findall(u'Lorem, ipsum! dolör sit? amet...')
[u'Lorem', u'ipsum', u'dol\xf6r', u'amet']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With