
re.search Multiple lines Python

re.search with \s or '\n' is not finding the multi-line pattern I'm trying to match.

Portion of Source:

Date/Time:
2013-08-27 17:05:36 

----- BEGIN SEARCH -----

GENERAL DATA:
NAME:   AB12
SECTOR: 
999,999
CONTROLLED BY:  Player
ALLIANCE:   Aliance
ONLINE: 1 seconds ago
SIZE:   Large
HOMEWORLD:  NO
APPROVAL RATING:    100%
PRODUCTION RATE:    100%

RESOURCE DATA:
POWER:  0 / 0
BUILDINGS:  0 / 20
ORE:    80,000 / 80,000
CRYSTAL:    80,000 / 80,000
POPULATION: 40,000 / 40,000

BUILDING DATA:
N/A

UNIT DATA:
WYVERN(S):  100

----- END SEARCH -----

Looking at it in Notepad++ I see "BUILDING DATA:(LF)"

Full Code

lines = open('scan.txt','r').readlines()
for a in lines:
    if re.search(r"\A\d", a):
        digits = a
        if re.search(r"2013", digits):
            date.append(digits[:19])
            count +=1
        elif re.search(r",", digits):
            clean = digits.rstrip()
            sector = clean.split(',')
            x.append(sector[0])
            y.append(sector[1])
    elif re.search(r"CONTROLLED BY:", a):
        player.append(a[15:].rstrip())
    elif re.search(r"ALLIANCE:", a):
        alliance.append(a[10:].rstrip())
    elif re.search(r"SIZE:", a):
        size.append(a[6:].rstrip())
    elif re.findall('BUILDING DATA:\sN/A', a, re.M):
        def_grid = ''
        print "Didn't find it"
        defense.append(def_grid)
        defense_count +=1
    elif re.search(r"DEFENSE GRID", a):
        def_grid = a[16:].rstrip()
        print "defense found"
        defense_count +=1

But nothing is returned.

I need to put an empty spacer in when "DEFENSE GRID" doesn't exist after "BUILDING DATA:"

I know I'm missing something, and I've tried reading up on re.search, but I can't find any thorough examples that explain how multiline matching works.

Xariec asked Aug 29 '13 21:08


2 Answers

re.findall("BUILDING DATA:\nN/A",a,re.MULTILINE)
Goontracker answered Sep 25 '22 11:09


You can do just what you did, but using re.findall instead of re.search:

re.findall('BUILDING DATA:\nN/A', a, re.M)
#['BUILDING DATA:\nN/A']
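
As a quick check of what the flag does, here is a small sketch written for Python 3 (the sample strings are made up from the question's data, and the variable names are only illustrative). re.M only changes where ^ and $ match; a \n or \s in the pattern matches a newline with or without it, provided the string being searched actually contains both lines.

```python
import re

# Hypothetical sample: two lines from the question's scan, held in ONE string.
text = "BUILDING DATA:\nN/A\n"

# The newline in the pattern matches with or without re.M.
with_flag = re.findall(r"BUILDING DATA:\nN/A", text, re.M)
without_flag = re.findall(r"BUILDING DATA:\sN/A", text)

# A single line, as produced by readlines(), never contains the next line,
# so the same pattern cannot match it.
single_line = "BUILDING DATA:\n"
line_result = re.findall(r"BUILDING DATA:\sN/A", single_line, re.M)

# re.M matters only for the anchors ^ and $.
anchored_plain = re.findall(r"^N/A$", text)
anchored_multi = re.findall(r"^N/A$", text, re.M)

print(with_flag, without_flag, line_result, anchored_plain, anchored_multi)
```

The empty result for single_line is the same situation as the loop in the question, where each a is one line of the file.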

EDIT:

The problem is that you are reading the file line by line, so each string a holds only one line and a pattern that spans two lines can never match. To detect such a pattern, you have to search the text as a whole, for example by joining the lines:

s = ''.join(lines)

This is fine as long as the file is not too large. You can then use s to perform your multi-line searches.
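
One possible way to apply this to the "empty spacer" requirement is sketched below for Python 3. It is not the asker's code: it assumes every scan sits between the BEGIN SEARCH and END SEARCH markers shown in the question, and the "DEFENSE GRID(S):" line in the second record is a guessed format, since the question never shows one. The sample string s stands in for the joined file contents.

```python
import re

# Hypothetical sample standing in for ''.join(lines).
# The "DEFENSE GRID(S)" line in the second record is an assumed format.
s = """----- BEGIN SEARCH -----
CONTROLLED BY:  Player
BUILDING DATA:
N/A

UNIT DATA:
WYVERN(S):  100
----- END SEARCH -----
----- BEGIN SEARCH -----
CONTROLLED BY:  Other
BUILDING DATA:
DEFENSE GRID(S):  5

UNIT DATA:
N/A
----- END SEARCH -----
"""

# One string per scan; re.S lets '.' match across line boundaries.
records = re.findall(
    r"----- BEGIN SEARCH -----(.*?)----- END SEARCH -----", s, re.S
)

defense = []
for record in records:
    # Capture whatever follows the colon on the defense grid line, if present.
    match = re.search(r"DEFENSE GRID[^:\n]*:[ \t]*(.*)", record)
    # Append an empty spacer when the record has no defense grid line.
    defense.append(match.group(1).rstrip() if match else "")

print(defense)
```

Working per record keeps defense the same length as the other lists, because every scan contributes exactly one entry whether or not it has a defense grid.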

Saullo G. P. Castro answered Sep 25 '22 11:09