I am trying to make a small application which uses pyparsing
to extract data from files produced by another program.
These files have the following format:
SOME_KEYWORD:
line 1
line 2
line 3
line 4
ANOTHER_KEYWORD:
line a
line b
line c
How can I construct a grammar that extracts line 1, line 2, ... line 4 and line a ... line c?
I am trying to build something like this:
Grammar = Keyword("SOME_KEYWORD:").suppress() + NonEmptyLines + EmptyLine.suppress() +\
Keyword("ANOTHER_KEYWORD:").suppress() + NonEmptyLines + EmptyLine.suppress()
But I don't know how to define NonEmptyLines and EmptyLine.
Thanks.
My take on it:
from pyparsing import *
# matches and removes end of line
EOL = LineEnd().suppress()
# a line: begins at a line start, takes everything up to EOL, and fails on blank lines
line = LineStart() + SkipTo(LineEnd(), failOn=LineStart()+LineEnd()) + EOL
lines = OneOrMore(line)
# Group nests each block's lines in its own sub-list; drop it if you want a flat result
parser = Keyword("SOME_KEYWORD:") + EOL + Group(lines) + Keyword("ANOTHER_KEYWORD:") + EOL + Group(lines)
result = parser.parseFile('data.txt')
print(result)
Result is:
['SOME_KEYWORD:', ['line 1', 'line 2', 'line 3', 'line 4'], 'ANOTHER_KEYWORD:', ['line a', 'line b', 'line c']]
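To sanity-check that output independently of pyparsing, here is a minimal plain-Python sketch that builds the same nested list from the format shown in the question. It is not a pyparsing grammar, just a cross-check. The names SAMPLE, KEYWORD_RE and split_sections are made up for illustration, and it assumes that keyword lines consist only of upper-case letters and underscores followed by a colon, and that blank lines merely separate sections.

```python
import re

# Hypothetical sample mirroring the format from the question,
# with a blank line between the two sections.
SAMPLE = """SOME_KEYWORD:
line 1
line 2
line 3
line 4

ANOTHER_KEYWORD:
line a
line b
line c
"""

# Assumption: a keyword line is upper-case letters/underscores plus a colon.
KEYWORD_RE = re.compile(r"^[A-Z_]+:$")


def split_sections(text):
    """Return [keyword, [lines...], keyword, [lines...], ...]."""
    result = []
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue  # blank lines only separate sections
        if KEYWORD_RE.match(stripped):
            # Start a new section: the keyword, then an empty list for its lines.
            result.append(stripped)
            result.append([])
        elif result:
            # Data line: attach it to the most recent section.
            result[-1].append(stripped)
    return result


print(split_sections(SAMPLE))
```

Comparing this list with the parser's result (for example via result.asList()) is a quick way to confirm that the grammar picks up every line on your real files.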