re.search with \s or '\n' is not finding the multiline pattern I'm trying to search for.
Portion of Source:
Date/Time:
2013-08-27 17:05:36
----- BEGIN SEARCH -----
GENERAL DATA:
NAME: AB12
SECTOR:
999,999
CONTROLLED BY: Player
ALLIANCE: Aliance
ONLINE: 1 seconds ago
SIZE: Large
HOMEWORLD: NO
APPROVAL RATING: 100%
PRODUCTION RATE: 100%
RESOURCE DATA:
POWER: 0 / 0
BUILDINGS: 0 / 20
ORE: 80,000 / 80,000
CRYSTAL: 80,000 / 80,000
POPULATION: 40,000 / 40,000
BUILDING DATA:
N/A
UNIT DATA:
WYVERN(S): 100
----- END SEARCH -----
Looking at it in Notepad++ I see "BUILDING DATA:(LF)"
Full Code
lines = open('scan.txt','r').readlines()
for a in lines:
    if re.search(r"\A\d", a):
        digits = a
        if re.search(r"2013", digits):
            date.append(digits[:19])
            count +=1
        elif re.search(r",", digits):
            clean = digits.rstrip()
            sector = clean.split(',')
            x.append(sector[0])
            y.append(sector[1])
    elif re.search(r"CONTROLLED BY:", a):
        player.append(a[15:].rstrip())
    elif re.search(r"ALLIANCE:", a):
        alliance.append(a[10:].rstrip())
    elif re.search(r"SIZE:", a):
        size.append(a[6:].rstrip())
    elif re.findall('BUILDING DATA:\sN/A', a, re.M):
        def_grid = ''
        print "Didn't find it"
        defense.append(def_grid)
        defense_count +=1
    elif re.search(r"DEFENSE GRID", a):
        def_grid = a[16:].rstrip()
        print "defense found"
        defense_count +=1
But nothing is being returned.
I need to insert an empty spacer when "DEFENSE GRID" doesn't appear after "BUILDING DATA:".
I know I'm missing something, and I've tried reading up on re.search, but I can't find any thorough examples that explain how multiline matching works.
The re.MULTILINE flag tells Python to make the '^' and '$' special characters match at the start or end of any line within a string, not only at the start or end of the whole string.
The re.DOTALL flag tells Python to make the '.' special character match every character, including newline characters.
The re.MULTILINE search modifier forces the ^ symbol to match at the beginning of each line of text (and not just the first), and the $ symbol to match at the end of each line of text (and not just the last one).
The m flag indicates that a multiline input string should be treated as multiple lines. For example, if m is used, ^ and $ change from matching at only the start or end of the entire string to the start or end of any line within the string.
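To make the flag descriptions above concrete, here is a small sketch. The two-line sample string is only an illustration, not part of the scan file. It suggests that re.MULTILINE changes only ^ and $, that re.DOTALL changes only '.', and that a literal \n in a pattern needs no flag at all as long as the newline is actually inside the string being searched.

```python
import re

# Illustrative two-line string (an assumption for this example).
text = "This is a paragraph.\nIt has multiple lines."

# Without re.MULTILINE, ^ matches only at the very start of the string.
no_flag = re.search(r"^It has", text)

# With re.MULTILINE, ^ also matches right after each newline.
with_flag = re.search(r"^It has", text, re.MULTILINE)

# Without re.DOTALL, '.' will not cross a newline; with it, '.' does.
dot_plain = re.search(r"paragraph.+lines", text)
dot_all = re.search(r"paragraph.+lines", text, re.DOTALL)

# A literal \n in the pattern needs no flag, but the string being
# searched has to contain both lines.
newline_match = re.search(r"paragraph\.\nIt", text)

print(no_flag is None)            # ^ did not match on the second line
print(with_flag.group())          # matched text with re.MULTILINE
print(dot_plain is None)          # '.' stopped at the newline
print(dot_all is not None)        # '.' crossed the newline
print(newline_match is not None)  # literal \n matched without any flag
```

Note that none of these flags help when the pattern is applied to one line at a time, because each individual line contains at most one trailing newline.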
re.findall("BUILDING DATA:\nN/A",a,re.MULTILINE)
You can do just what you did, but using re.findall instead of re.search:
re.findall('BUILDING DATA:\nN/A', a, re.M)
#['BUILDING DATA:\nN/A']
EDIT:
The problem is that you are currently reading line-by-line. In order to detect a pattern that belongs to two or more lines, you have to consider the string as a whole, maybe doing:
s = ''.join(lines)
which is fine if lines is not too big, and then use s to perform your multi-line searches.
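As a rough sketch of the whole-string approach, the example below splits the text into records on the BEGIN marker and records an empty spacer whenever a record has no DEFENSE GRID line. Several things here are assumptions: the helper name defense_per_record is hypothetical, the second record is invented, and the "DEFENSE GRID(S): 5" line is modeled on the "WYVERN(S): 100" line because the question does not show the real format.

```python
import re

# Sample shaped like the scan in the question. The second record and its
# "DEFENSE GRID(S): 5" line are invented for illustration.
sample = (
    "----- BEGIN SEARCH -----\n"
    "CONTROLLED BY: Player\n"
    "BUILDING DATA:\n"
    "N/A\n"
    "UNIT DATA:\n"
    "WYVERN(S): 100\n"
    "----- END SEARCH -----\n"
    "----- BEGIN SEARCH -----\n"
    "CONTROLLED BY: Other\n"
    "BUILDING DATA:\n"
    "DEFENSE GRID(S): 5\n"
    "UNIT DATA:\n"
    "----- END SEARCH -----\n"
)


def defense_per_record(text):
    """Return one value per record; '' when no DEFENSE GRID line exists."""
    defense = []
    # The first chunk is whatever precedes the first BEGIN marker, so skip it.
    for record in text.split("----- BEGIN SEARCH -----")[1:]:
        # re.MULTILINE lets ^ and $ anchor to a line inside the record.
        match = re.search(
            r"^DEFENSE GRID[^:\n]*:[ \t]*(.*)$", record, re.MULTILINE
        )
        defense.append(match.group(1).strip() if match else "")
    return defense


print(defense_per_record(sample))
```

For the real file, the text could be read in one piece with open('scan.txt').read() instead of readlines(), which gives the same string as ''.join(lines). Working per record keeps the defense list aligned with the other per-record lists, since every record contributes exactly one entry.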