I am looking at a log message with the following format
datetime log_message_type message_type server {json_string}
So the fields on each line are separated by whitespace, every line has the same fields, and each line ends with a JSON string containing a variety of fields.
I thought about doing this with a simple
with open('test.log', 'r') as f:
    for x in f:
        line = x.split()
        datetime = line[0]
        log_message_type = line[1]
        message_type = line[2]
        server = line[3]
        json_string = line[4]
This would have worked, except there are spaces in my json string, for example, something like this.
{ "foo" : "bar" }
So doing it in this way would split up my json string at the spaces. Is there any way I could use a regex or something to split on whitespace only until I get to the "json string" section of the line, and then preserve the rest of it? I tried doing something like
line = re.compile(".*\s.*\s.*\s.*\s").split(x)
To attempt to parse the line based on the 4 spaces before the json string portion, but I'm afraid I just don't know enough about how the regex system in python works. Could anyone give me a hand?
Edit : forgot to mention, I'm stuck with python 2.7 for this.
Limit the number of splits with the maxsplit argument, so that everything after the fourth split stays in one piece:
line = x.split(maxsplit=4)
>>> "a b c d my json expression".split(maxsplit=4)
['a', 'b', 'c', 'd', 'my json expression']
Note: in Python 2, str.split does not accept keyword arguments, so you have to pass them positionally (this also works in Python 3, BTW):
line = x.split(None,4)
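Putting it together, here is a minimal sketch of parsing one line and decoding the trailing JSON. The function name parse_log_line and the sample line are made up for illustration, and it assumes none of the first four fields contains whitespace. The positional form of split is used so it should run on both Python 2.7 and Python 3.

```python
import json

def parse_log_line(line):
    # Hypothetical helper: split on whitespace at most 4 times so the
    # JSON payload (which may contain spaces) stays in one piece.
    fields = line.strip().split(None, 4)
    datetime_str, log_message_type, message_type, server, json_string = fields
    return {
        "datetime": datetime_str,
        "log_message_type": log_message_type,
        "message_type": message_type,
        "server": server,
        # Decode the remainder of the line as JSON.
        "payload": json.loads(json_string),
    }

# Made-up sample line in the format described in the question.
sample = '2017-01-01T12:00:00 INFO request web01 { "foo" : "bar" }\n'
record = parse_log_line(sample)
print(record["server"])
print(record["payload"]["foo"])
```

If a line has fewer than five fields, the tuple unpacking raises ValueError, so you may want to catch that for malformed lines.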