Given the following JSON:
{
    "README.rst": {
        "_status": {
            "md5": "952ee56fa6ce36c752117e79cc381df8"
        }
    },
    "docs/conf.py": {
        "_status": {
            "md5": "6e9c7d805a1d33f0719b14fe28554ab1"
        }
    }
}
is there a query language that can produce:
{
    "README.rst": "952ee56fa6ce36c752117e79cc381df8",
    "docs/conf.py": "6e9c7d805a1d33f0719b14fe28554ab1"
}
My best attempt so far with JMESPath (http://jmespath.org/) isn't very close:
>>> import jmespath
>>> jmespath.search('*.*.md5[]', db)
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']
I've gotten to the same point with ObjectPath (http://objectpath.org):
>>> from objectpath import Tree
>>> t = Tree(db)
>>> list(t.execute('$..md5'))
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']
I couldn't make any sense of JSONiq (do I really need to read a 105-page manual to do this?). This is my first time looking at JSON query languages.
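Not sure why you want a query language; this is pretty easy in plain Python:
    def find_key(data, key="md5"):
        """Return the first value stored under `key` anywhere in a nested dict."""
        for k, v in data.items():
            if k == key:
                return v
            if isinstance(v, dict):
                result = find_key(v, key)
                # Compare against None so falsy values (e.g. "") are still returned.
                if result is not None:
                    return result
        return None

    dict((k, find_key(v, "md5")) for k, v in json_result.items())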
It's even easier if every value dict always has "_status" and "md5" as keys:
    dict((k, v["_status"]["md5"]) for k, v in json_result.items())
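As a self-contained illustration of the one-liner above, here is a minimal sketch that loads the sample document from the question and flattens it. The helper name `md5_by_file` is hypothetical, and skipping entries that lack `_status` or `md5` is an assumption about the desired behaviour, not something the question specifies.

```python
import json

# Sample document from the question.
raw = """
{
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}}
}
"""
db = json.loads(raw)


def md5_by_file(data):
    """Map each top-level key to its _status.md5 value.

    Entries without a "_status" dict or an "md5" key are skipped
    (an assumption; raising KeyError would be equally reasonable).
    """
    result = {}
    for name, entry in data.items():
        status = entry.get("_status") if isinstance(entry, dict) else None
        if isinstance(status, dict) and "md5" in status:
            result[name] = status["md5"]
    return result


flat = md5_by_file(db)
print(json.dumps(flat, indent=4))
```

If the structure really is fixed, the plain dict expression is shorter; the helper only pays off when some entries may be incomplete.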
Alternatively, I think you could do something like
    t = Tree(db)
    dict(zip(t.execute("$."), t.execute('$..md5')))
although I don't know whether it would match the keys and values up quite right ...
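The concern about matching can be made concrete without any query library. The sketch below simulates the two result lists in plain Python; the third entry and its `<hypothetical-md5>` value are invented for illustration and do not come from the question. When one entry has no md5, zipping two independently collected lists silently pairs keys with the wrong values, whereas reading the key and value together does not.

```python
# Hypothetical data: "docs/conf.py" has no md5, "setup.py" is an invented entry.
db = {
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {}},
    "setup.py": {"_status": {"md5": "<hypothetical-md5>"}},
}

# Two independent "queries": all top-level keys, and all md5 values found.
keys = list(db)
md5s = [entry["_status"]["md5"] for entry in db.values() if "md5" in entry["_status"]]

# zip pairs by position, so the missing md5 shifts every later value.
zipped = dict(zip(keys, md5s))

# Reading key and value from the same entry keeps them aligned.
paired = {
    name: entry["_status"]["md5"]
    for name, entry in db.items()
    if "md5" in entry["_status"]
}

print(zipped)
print(paired)
```

With the original two-entry document both approaches agree; the difference only shows up once the data is irregular.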
Here is the JSONiq code that does the job:
{|
    for $key in keys($document)
    return {
        $key: $document.$key._status.md5
    }
|}
You can execute it here with the Zorba engine.
If the 105-page manual you mention is the specification, I do not recommend reading it as a JSONiq user. I would rather advise reading tutorials or books online, which give a more gentle introduction.
In ObjectPath (where op is a Tree built from the document):
    l = op.execute("[keys($.*), $..md5]")
you'll get:
[
    [
        "README.rst",
        "docs/conf.py"
    ],
    [
        "952ee56fa6ce36c752117e79cc381df8",
        "6e9c7d805a1d33f0719b14fe28554ab1"
    ]
]
then in Python:
dict(zip(l[0],l[1]))
to get:
{
'README.rst': '952ee56fa6ce36c752117e79cc381df8',
'docs/conf.py': '6e9c7d805a1d33f0719b14fe28554ab1'
}
Hope that helps. :)
PS. I'm using ObjectPath's keys() to show how to write a full query that works anywhere in the document, not only when the keys are at the root of the document.
PS2. I might add a new function so that it would look like: object([keys($.*), $..md5]). Send me a tweet at http://twitter.com/adriankal if you want that.
I missed the Python requirement, but if you are willing to call an external program, this will still work. Note that jq >= 1.5 is required.
# If a single top-level key ($p[0]) contains several md5 keys, add merges them and only the last one is kept.
cat /tmp/test.json | \
jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"]}] | add '
# This does not create a single object, but shows every (key, md5) combination.
cat /tmp/test.json | \
jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"]}] '
How it works: paths(has("md5")?) yields every path whose value has an "md5" key; the '?' suppresses errors (such as testing a scalar for a key). For each resulting path ($p), the '{}' builds an object from the first path component and the md5 value. The '[]' around the whole expression collects those objects into an array, which '| add' then merges into a single object.
https://stedolan.github.io/jq/
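For readers who want to stay in Python, the jq pipeline above can be approximated with a short recursive walk. This is a rough sketch of the same idea, not a port of jq: the function names `paths_with_key` and `collect` are hypothetical, and, as with `add`, a later match under the same top-level key overwrites an earlier one.

```python
def paths_with_key(node, key, path=()):
    """Yield (path, container) for every nested dict that has `key`.

    Mirrors paths(has("md5")?) in spirit: the root itself is not reported,
    and non-dict values are skipped instead of raising an error.
    """
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return
    for name, child in children:
        child_path = path + (name,)
        if isinstance(child, dict) and key in child:
            yield child_path, child
        yield from paths_with_key(child, key, child_path)


def collect(data, key="md5"):
    """Map the first path component to the value of `key`; later matches win."""
    merged = {}
    for path, container in paths_with_key(data, key):
        merged[path[0]] = container[key]
    return merged


db = {
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}},
}
result = collect(db)
print(result)
```

Unlike the fixed `v["_status"]["md5"]` lookup, this finds the key at any depth, which is the main reason to prefer the path-based approach.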