
Matching nonempty lines with pyparsing

I am trying to make a small application which uses pyparsing to extract data from files produced by another program.

These files have the following format:

    SOME_KEYWORD:
    line 1
    line 2
    line 3
    line 4

    ANOTHER_KEYWORD:
    line a
    line b
    line c

How can I construct a grammar that extracts line 1 through line 4 and line a through line c? I am trying to build something like this:

    Grammar = Keyword("SOME_KEYWORD:").suppress() + NonEmptyLines + EmptyLine.suppress() +\
             Keyword("ANOTHER_KEYWORD:").suppress() + NonEmptyLines + EmptyLine.suppress()

But I don't know how to define NonEmptyLines and EmptyLine. Thanks.

Alik asked Dec 28 '22 21:12



1 Answer

My take on it:

    from pyparsing import *

    # matches and removes end of line
    EOL = LineEnd().suppress()

    # a line: begins at a line start, takes everything up to EOL, and fails on blank lines
    line = LineStart() + SkipTo(LineEnd(), failOn=LineStart()+LineEnd()) + EOL

    lines = OneOrMore(line)

    # Group collects each block's lines into a nested list; remove it for a flat result
    parser = Keyword("SOME_KEYWORD:") + EOL + Group(lines) + Keyword("ANOTHER_KEYWORD:") + EOL + Group(lines)
    result = parser.parseFile('data.txt')
    print(result)

Result is:

    ['SOME_KEYWORD:', ['line 1', 'line 2', 'line 3', 'line 4'], 'ANOTHER_KEYWORD:', ['line a', 'line b', 'line c']]
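
As a cross-check on that result, here is a minimal sketch in plain Python (standard library only, so this is not pyparsing) that applies the same rule the grammar encodes: a keyword line ending in a colon opens a block, and the first blank line closes it. The `split_blocks` name and the inline `sample` string are my own, made up for illustration; swap in the contents of your real file.

```python
def split_blocks(text):
    """Group non-empty lines under the keyword line that precedes them."""
    blocks = {}
    current = None  # keyword of the block being read, or None between blocks
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current block.
            current = None
        elif current is None and line.endswith(":"):
            # Outside a block, a line ending in ":" opens a new one.
            current = line[:-1]
            blocks[current] = []
        elif current is not None:
            # Any other non-empty line belongs to the open block.
            blocks[current].append(line)
    return blocks


# Hypothetical sample mirroring the format shown in the question.
sample = """SOME_KEYWORD:
line 1
line 2
line 3
line 4

ANOTHER_KEYWORD:
line a
line b
line c
"""

print(split_blocks(sample))
```

This prints a dict mapping each keyword (without its colon) to the list of lines under it, which should agree with the grouped lists from the parser. It is only useful as a sanity check; for anything beyond this simple layout the grammar is the better tool.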
Henry answered Jan 15 '23 07:01
