
In Python, is there a way to extract an embedded JSON string from random text? [duplicate]

Tags: python, json

I'm parsing a really big log file with some embedded JSON, so I see lines like this:

foo="{my_object:foo, bar:baz}" a=b c=d

The problem is that the embedded JSON can contain spaces, but outside of the JSON, spaces act as tuple delimiters (except inside unquoted strings; huzzah for whatever idiot thought that was a good idea). So I'm not sure how to figure out where the JSON string ends without reimplementing large portions of a JSON parser.

Is there a JSON parser for Python where I can give it '{"my_object":"foo", "bar":"baz"} asdfasdf' and have it return ({'my_object' : 'foo', 'bar':'baz'}, 'asdfasdf'), or am I going to have to reimplement the JSON parser by hand?

Kevin Meyer asked Dec 31 '25 04:12


1 Answer

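Found a really cool answer: use the scan_once function on a json.JSONDecoder instance. You give it the string and the index where the JSON starts, and it returns the decoded value along with the index where the JSON ended.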

In [30]: import json

In [31]: d = json.JSONDecoder()

In [32]: my_string = 'key="{"foo":"bar"}"more_gibberish'

In [33]: d.scan_once(my_string, 5)
Out[33]: ({u'foo': u'bar'}, 18)

In [37]: my_string[18:]
Out[37]: '"more_gibberish'
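For what it's worth, scan_once is not part of the documented json API. The documented method is JSONDecoder.raw_decode, which returns the same (value, end index) pair. Here is a minimal sketch of the (object, remainder) function the question asks for, assuming Python 3 and that you already know where the JSON starts. The name split_json_prefix is made up for illustration, and passing the start index as a second argument relies on CPython's raw_decode(s, idx=0) signature.

```python
import json

def split_json_prefix(text, start=0):
    """Decode one JSON value beginning at text[start].

    Returns (value, rest), where rest is everything after the JSON value.
    Hypothetical helper, not part of the json module.
    """
    decoder = json.JSONDecoder()
    # raw_decode returns the decoded value and the index just past it.
    value, end = decoder.raw_decode(text, start)
    return value, text[end:]

value, rest = split_json_prefix('{"my_object":"foo", "bar":"baz"} asdfasdf')
print(value)       # {'my_object': 'foo', 'bar': 'baz'}
print(repr(rest))  # ' asdfasdf'
```

The remainder keeps its leading space, so call rest.lstrip() if you want exactly 'asdfasdf'. Also note that raw_decode does not skip leading whitespace, so start has to point at the first character of the JSON value.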

Just be careful with the start index. If it's off by one, you get a different value back instead of an error:

In [38]: d.scan_once(my_string, 6)
Out[38]: (u'foo', 11)
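Since a wrong offset silently returns something else, one way to avoid counting characters by hand is to try decoding at every '{' and keep whatever parses. This is only a sketch under a couple of assumptions: Python 3, the embedded values are JSON objects, and they are valid JSON (the unquoted {my_object:foo, bar:baz} sample from the question would not decode). The name iter_embedded_json is made up for illustration.

```python
import json

def iter_embedded_json(text):
    """Yield (value, start, end) for each JSON object found in text.

    Hypothetical helper: tries raw_decode at every '{' and skips
    positions that do not parse as JSON.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find('{', pos)
        if start == -1:
            return
        try:
            value, end = decoder.raw_decode(text, start)
        except ValueError:
            # json.JSONDecodeError subclasses ValueError; not JSON here,
            # so move on to the next '{'.
            pos = start + 1
            continue
        yield value, start, end
        # Resume after the decoded object so nested braces are not rescanned.
        pos = end

for value, start, end in iter_embedded_json('key="{"foo":"bar"}"more_gibberish'):
    print(value, start, end)  # {'foo': 'bar'} 5 18
```

On a very large log this retries at every stray '{', so it may be worth narrowing the search first (for example, only looking after a known key= prefix) if speed matters.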
Kevin Meyer answered Jan 02 '26 18:01


