I'm using pyparsing to process the output of a hex-to-text converter. It prints 16 characters per line, separated by spaces. If the hex value is an ASCII-printable character, that character is printed; otherwise the converter outputs a period (.).
Mostly the output looks like this:
. a . v a l i d . s t r i n g .
. a n o t h e r . s t r i n g .
. e t c . . . . . . . . . . . .
My pyparsing code to describe this line is:
dump_line = 16 * Word(printables, exact=1)
This works fine, until the hex-to-text converter hits a hex value of 0x20, which causes it to output a space.
l i n e . w . a .   s p a c e .
In that case, pyparsing skips over the space character and consumes characters from the following line to make up the "quota" of 16 characters.
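The shortfall is easy to reproduce without pyparsing. This is a minimal sketch that assumes `str.split()` as a stand-in for any tokenizer that skips whitespace between tokens; it is an illustration, not a description of pyparsing's internals:

```python
# Illustration only: str.split() mimics whitespace-skipping tokenization.
# It is NOT how pyparsing is implemented.

# Build the dump line from its 16 characters so the spacing is unambiguous;
# the tenth character is the literal space produced by 0x20.
line_with_space = " ".join("line.w.a. space.")

tokens = line_with_space.split()

print(len(line_with_space))  # 16 characters plus 15 separators
print(len(tokens))           # one fewer than 16: the space is gone
```

The line is 31 characters long, but only 15 tokens survive, so a parser that insists on 16 tokens has to borrow one from the next line.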
Can someone please suggest how I can tell pyparsing to expect 16 characters, each separated by a space, where a space can also be a valid character?
Thanks in advance. J
Since this has significant whitespace, you'll need to tell your character expression to leave leading whitespace alone. See how this is done below in the definition of dumpchar:
hexdump = """\
. a . v a l i d . s t r i n g .
. a n o t h e r . s t r i n g .
. e t c . . . . . . . . . . . .
l i n e . w . a .   s p a c e .
. e t c . . . . . . . . . . . .
"""
from pyparsing import oneOf, printables, delimitedList, White, LineEnd
# expression for a single char or space
dumpchar = oneOf(list(printables)+[' ']).leaveWhitespace()
# convert '.'s to something else, if you like; in this example, '_'
dumpchar.setParseAction(lambda t:'_' if t[0]=='.' else None)
# expression for a whole line of dump chars - intervening spaces will
# be discarded by delimitedList
dumpline = delimitedList(dumpchar, delim=White(' ',exact=1)) + LineEnd().suppress()
# if you want the intervening spaces, use this form instead
#dumpline = delimitedList(dumpchar, delim=White(' ',exact=1), combine=True) + LineEnd().suppress()
# read dumped lines from hexdump
for t in dumpline.searchString(hexdump):
    print(''.join(t))
Prints:
_a_valid_string_
_another_string_
_etc____________
line_w_a_ space_
_etc____________
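Consider another way to remove the spaces: slice out every second character. A literal space produced by 0x20 sits at an even index, so the slice keeps it.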
>>> s=". a . v a l i d . s t r i n g ."
>>> s=s[::2]
>>> s
'.a.valid.string.'
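The same slice can be applied to a whole dump, line by line. This is a minimal sketch, not a drop-in replacement for the pyparsing version: `decode_dump`, `width` and `filler` are hypothetical names, and it assumes every line is exactly 31 characters long (16 characters plus 15 single-space separators) with no trailing whitespace stripped.

```python
def decode_dump(text, width=16, filler="_"):
    """Return one decoded string per non-empty dump line.

    Hypothetical helper: assumes each line holds `width` characters
    separated by single spaces.
    """
    expected = 2 * width - 1  # characters plus the separators between them
    decoded = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines between dump rows
        if len(line) != expected:
            raise ValueError("expected %d characters, got %d: %r"
                             % (expected, len(line), line))
        # Odd positions must all be separator spaces.
        if set(line[1::2]) != {" "}:
            raise ValueError("unexpected separator in line: %r" % line)
        # Even positions are the dump characters, including a real space.
        decoded.append(line[::2].replace(".", filler))
    return decoded


# Build sample lines from their 16 characters so the spacing is unambiguous.
dump = "\n".join(" ".join(row) for row in [".a.valid.string.",
                                           "line.w.a. space."])
for row in decode_dump(dump):
    print(row)
```

The length and separator checks make a malformed line fail loudly rather than shift characters silently. As with the pyparsing version, a period that was really in the data cannot be told apart from a non-printable byte.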