I'm trying to extract data from a SQL export file with a regular expression. To match the post content field, I use '(?P<content>.*?)'. It works fine most of the time, but if the field contains '\n', the regular expression doesn't match. How can I modify the regular expression to match these fields too? Thanks!
Example (I'm using Python):
>>> re.findall("'(?P<content>.*?)'","'<p>something, something else</p>'")
['<p>something, something else</p>']
>>> re.findall("'(?P<content>.*?)'","'<p>something, \n something else</p>'")
[]
P.S. It seems that every sequence starting with '\' is treated as an escape sequence. How can I tell the regex to treat them literally?
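You should use the re.DOTALL option, which makes '.' match newlines as well (by default it matches any character except a newline):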
>>> re.findall("'(?P<content>.*?)'","'<p>something, \n something else</p>'", re.DOTALL)
['<p>something, \n something else</p>']
See the Python documentation for re.DOTALL.
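As a rough sketch of the points above, the snippet below compares the default behaviour with re.DOTALL and with the equivalent inline (?s) flag, and also touches on the P.S.: in a normal Python string literal, "\n" is turned into a real newline before the regex engine ever sees it, whereas a raw string (r"...") keeps the backslash and the "n" as two literal characters. The variable names are only illustrative, and the sample text is the one from the question.

```python
import re

pattern = r"'(?P<content>.*?)'"

# Normal string literal: "\n" here is a real newline character.
text = "'<p>something, \n something else</p>'"

# Default: "." does not match a newline, so nothing is found.
no_flag = re.findall(pattern, text)

# re.DOTALL makes "." match newlines as well.
with_flag = re.findall(pattern, text, re.DOTALL)

# Inline form of the same flag: (?s) at the start of the pattern.
inline = re.findall(r"(?s)'(?P<content>.*?)'", text)

# Raw string literal: backslash and "n" stay as two literal characters,
# so there is no newline and the default "." already matches.
raw_text = r"'<p>something, \n something else</p>'"
raw_match = re.findall(pattern, raw_text)

print(no_flag)
print(with_flag)
print(inline)
print(raw_match)
```

If the export file is read from disk rather than typed as a literal, the text will contain whatever characters are actually in the file, so passing re.DOTALL is likely the safer choice whenever a field may span several lines.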