Python RegEx Matching Newline

Question

I have the following regular expression:

[0-9]{8}.*\n.*\n.*\n.*\n.*

I have tested it in Expresso against the file I am working with, and the match is successful.

I want to match the following:

Reference number 8 numbers long
Any character, any number of times
New Line
Any character, any number of times
New Line
Any character, any number of times
New Line
Any character, any number of times
New Line
Any character, any number of times

My python code is:

for m in re.findall('[0-9]{8}.*\n.*\n.*\n.*\n.*', l, re.DOTALL):
    print m

But no matches are produced. As noted, Expresso finds 400+ matches, which is what I would expect.

What am I missing here?

Tim Pietzcker · Accepted Answer

Don't use re.DOTALL; with it, the dot matches newlines too, so each .* can run across line boundaries. Also use raw strings (r"...") for regexes:

for m in re.findall(r'[0-9]{8}.*\n.*\n.*\n.*\n.*', l):
    print m
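To see what the flag changes, here is a minimal sketch using made-up sample data (two five-line records; the text and variable names are assumptions, not the asker's file). It is written for Python 3, whereas the snippets in this thread use Python 2's print statement.

```python
import re

# Hypothetical sample data: two five-line records, each starting with an
# eight-digit reference number.
text = (
    "12345678 first record\nline 2\nline 3\nline 4\nline 5\n"
    "87654321 second record\nline 2\nline 3\nline 4\nline 5\n"
)

pattern = r'[0-9]{8}.*\n.*\n.*\n.*\n.*'

# Default flags: "." stops at each newline, so each record is matched on its own.
default_matches = re.findall(pattern, text)

# re.DOTALL: "." also matches "\n", so the greedy ".*" runs to the end of the
# input and everything from the first reference number onward is one match.
dotall_matches = re.findall(pattern, text, re.DOTALL)

print(len(default_matches))  # one match per record
print(len(dotall_matches))   # a single match spanning both records
```

With re.DOTALL the pattern still matches, but the result is one oversized match rather than one match per record.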

However, your version still should have worked (although very inefficiently) if you have read the entire file as binary into memory as one large string.

So the question is, are you reading the file like this:

with open("filename","rb") as myfile:
    mydata = myfile.read()
    for m in re.findall(r'[0-9]{8}.*\n.*\n.*\n.*\n.*', mydata):
        print m

Or are you working with single lines (for line in myfile: or myfile.readlines())? In that case, the regex can't work, of course, because no single line contains the four newlines the pattern requires.
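The difference between the two ways of reading can be sketched as follows. This is an illustration only: io.StringIO stands in for the real file, and the sample contents are invented. It targets Python 3; note that there a file opened with "rb" yields bytes, so the pattern would have to be a bytes pattern as well.

```python
import io
import re

pattern = re.compile(r'[0-9]{8}.*\n.*\n.*\n.*\n.*')

# Hypothetical file contents: a single five-line record.
sample = "12345678 header\nline 2\nline 3\nline 4\nline 5\n"

# Line by line: each chunk holds at most one newline, so the pattern,
# which needs four of them, can never match.
line_matches = []
for line in io.StringIO(sample):
    line_matches.extend(pattern.findall(line))

# Whole contents as one string: the pattern can span all five lines.
whole_matches = pattern.findall(io.StringIO(sample).read())

print(line_matches)
print(whole_matches)
```

If the asker's variable l holds a single line rather than the whole file, that alone would explain the empty result.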
