Matching any character including newlines in a Python regex subexpression, not globally

2 Answers

To match a newline, or "any symbol", without re.S/re.DOTALL, you can use either of the following:

(?s:.) - an inline modifier group with the s flag on (Python 3.6+). It sets a scope in which every . pattern matches any character, including line break characters.

One of these character class workarounds:

[\s\S] [\w\W] [\d\D]

The main idea is that two opposite shorthand classes combined in one character class (for example, whitespace plus non-whitespace) match any character that can occur in the input string.
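As a minimal sketch of the difference (the sample string and variable names are made up for illustration), the same input can be matched with a plain dot, a scoped (?s:...) group, and a complementary character class:

```python
import re

text = "first line\nsecond line"

# By default, "." stops at a line break.
default_dot = re.match(r".+", text).group()

# Scoped DOTALL: only the "." inside (?s:...) matches line breaks.
scoped_dot = re.match(r"(?s:.+)", text).group()

# Complementary shorthand classes together cover every character.
class_any = re.match(r"[\s\S]+", text).group()

print(repr(default_dot))  # 'first line'
print(repr(scoped_dot))   # 'first line\nsecond line'
print(repr(class_any))    # 'first line\nsecond line'
```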

Compared with (.|\s) and other variations that use alternation, the character class solution is much more efficient because it involves much less backtracking (when used with a * or + quantifier). In a small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2 steps.
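Step counts like these come from a regex debugger, so they cannot be reproduced directly in Python. A rough analogue, assuming wall-clock time is an acceptable proxy, is to time both patterns with timeit. The sample text and repetition count below are arbitrary, and the absolute numbers will vary by machine and Python version:

```python
import re
import timeit

# Arbitrary multi-line sample; any text containing line breaks would do.
sample = "line of text\n" * 200

alternation = re.compile(r"(?:.|\n)+")
char_class = re.compile(r"[\s\S]+")

# Both patterns consume the whole sample, so the matched text is identical.
same_result = alternation.match(sample).group() == char_class.match(sample).group()

# Time repeated matching; only the relative difference is of interest.
alt_time = timeit.timeit(lambda: alternation.match(sample), number=200)
cls_time = timeit.timeit(lambda: char_class.match(sample), number=200)

print(same_result)
print(f"alternation: {alt_time:.4f}s, character class: {cls_time:.4f}s")
```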

See this Python demo, which matches from a line starting with 123 up to the first occurrence of 3 at the start of a line, including the rest of that line:

import re

text = """abc
123
def
356
more text..."""

print( re.findall(r"^123(?s:.*?)^3.*", text, re.M) )
# => ['123\ndef\n356']
print( re.findall(r"^123[\w\W]*?^3.*", text, re.M) )
# => ['123\ndef\n356']
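To illustrate the "not globally" part of the question, here is a small sketch (the sample text is hypothetical) contrasting the scoped group with the global re.S flag. Only the dot inside (?s:...) crosses line breaks, while a dot outside the group keeps its default behavior:

```python
import re

text = "start\nmiddle\nend of line\ntrailing"

# Inside (?s:...) the dot crosses line breaks; the final ".*" is outside
# the group, so it stops at the end of the "end of line" line.
scoped = re.search(r"start(?s:.*?)end.*", text).group()

# With re.S applied globally, the final ".*" also runs across line breaks.
global_flag = re.search(r"start.*?end.*", text, re.S).group()

print(repr(scoped))       # 'start\nmiddle\nend of line'
print(repr(global_flag))  # 'start\nmiddle\nend of line\ntrailing'
```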

176

answered Oct 06 '22 13:10

Wiktor Stribiżew

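Match any character (including a newline):

Regular expression (note that the class also contains a literal space ' '):

[\S\n\t\v ]

This class covers every non-whitespace character plus newline, tab, vertical tab and space. Other whitespace characters, such as \r and \f, are not matched.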

Example:

import re

text = 'abc def ###A quick brown fox.\nIt jumps over the lazy dog### ghi jkl'
# We want to extract "A quick brown fox.\nIt jumps over the lazy dog"
# The capturing group keeps the ### delimiters out of the result.
matches = re.findall(r'###([\S\n ]+)###', text)
print(matches[0])

matches[0] will contain:
'A quick brown fox.\nIt jumps over the lazy dog'

Description of '\S' from the Python docs:

\S Matches any character which is not a whitespace character.

(See: https://docs.python.org/3/library/re.html#regular-expression-syntax)
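Because \S excludes all whitespace, a class built from \S plus a hand-picked list of whitespace characters only matches the whitespace that is listed explicitly. The following sketch (the list of candidate characters is an arbitrary selection) checks which of them [\S\n\t\v ] misses compared with [\s\S]:

```python
import re

narrow = re.compile(r"[\S\n\t\v ]")  # the class from this answer
broad = re.compile(r"[\s\S]")        # complementary classes: every character

# A few sample characters, including common whitespace.
candidates = ["a", " ", "\n", "\t", "\v", "\r", "\f"]

# Characters matched by the broad class but not by the narrow one.
missed = [c for c in candidates if broad.fullmatch(c) and not narrow.fullmatch(c)]

print(missed)  # ['\r', '\x0c']
```

If the input may contain carriage returns (for example, text with Windows line endings), [\s\S] or (?s:.) is the safer choice.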

answered Oct 06 '22 13:10

Ali Sajjad


Tags:

python

regex

Jason S
