json query that returns parent element and child data?

Question

Given the following JSON:

{
    "README.rst": {
        "_status": {
            "md5": "952ee56fa6ce36c752117e79cc381df8"
        }
    },
    "docs/conf.py": {
        "_status": {
            "md5": "6e9c7d805a1d33f0719b14fe28554ab1"
        }
    }
}

is there a query language that can produce:

{
    "README.rst": "952ee56fa6ce36c752117e79cc381df8",
    "docs/conf.py": "6e9c7d805a1d33f0719b14fe28554ab1"
}

My best attempt so far with JMESPath (http://jmespath.org/) isn't very close:

>>> jmespath.search('*.*.md5[]', db)
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']

I've gotten to the same point with ObjectPath (http://objectpath.org):

>>> t = Tree(db)
>>> list(t.execute('$..md5'))
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']

I couldn't make any sense of JSONiq (do I really need to read a 105-page manual to do this?). This is my first time looking at JSON query languages.

Joran Beasley · Accepted Answer

I'm not sure why you want a query language; this is pretty easy in plain Python:

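def find_key(data, key="md5"):
    """Return the first value stored under `key` in nested dicts, or None."""
    for k, v in data.items():
        if k == key:
            return v
        if isinstance(v, dict):
            result = find_key(v, key)
            # Compare against None so falsy values such as "" are still returned.
            if result is not None:
                return result
    return None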

{k: find_key(v, "md5") for k, v in json_result.items()}

It's even easier if each value dict always has "_status" and "md5" as keys:

{k: v["_status"]["md5"] for k, v in json_result.items()}
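If some entries might lack "_status" or "md5", the direct lookup above raises a KeyError. Here is a minimal sketch that tolerates this, assuming entries without a hash should simply be skipped. The function name extract_md5 and the extra "setup.py" entry are made up for illustration; they are not from the question.

```python
def extract_md5(json_result):
    """Map each top-level key to its _status.md5, skipping entries without one."""
    out = {}
    for name, entry in json_result.items():
        # Guard each level so a missing or non-dict value does not raise.
        status = entry.get("_status") if isinstance(entry, dict) else None
        md5 = status.get("md5") if isinstance(status, dict) else None
        if md5 is not None:
            out[name] = md5
    return out


db = {
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}},
    "setup.py": {},  # hypothetical entry with no hash; it is skipped
}
print(extract_md5(db))
```

Whether to skip such entries, store None for them, or raise is a design choice that depends on how the result is used.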

Alternatively, I think you could do something like

>>> t = Tree(db)
>>> dict(zip(t.execute("$."), t.execute('$..md5')))

although I don't know whether it would match them up correctly.
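The concern about matching is real for any approach that collects keys and values separately and then zips them by position. The sketch below uses plain Python rather than ObjectPath; the hypothetical helper all_md5 only imitates a recursive "find every md5" query. It shows what happens when one entry has no hash:

```python
def all_md5(data):
    """Yield every "md5" value in document order, descending through dicts and lists."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == "md5":
                yield v
            else:
                yield from all_md5(v)
    elif isinstance(data, list):
        for item in data:
            yield from all_md5(item)


# Assumed input: the first file has no md5 at all.
db = {
    "README.rst": {"_status": {}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}},
}

# Pairing by position attaches the hash of docs/conf.py to README.rst.
paired = dict(zip(db, all_md5(db)))
print(paired)
```

As long as every entry has exactly one md5, the positions line up. Otherwise, looking the value up per key avoids the problem.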

Ghislain Fourny · Answer

Here is the JSONiq code that does the job:

{|
    for $key in keys($document)
    return {
        $key: $document.$key._status.md5
    }
|}

You can execute it here with the Zorba engine.

If the 105-page manual you mention is the specification, I do not recommend reading it as a JSONiq user. I would rather advise reading tutorials or books online, which give a more gentle introduction.

Adrian Kalbarczyk · Answer

In ObjectPath:

l = op.execute("[keys($.*), $..md5]")

you'll get:

[
  [
    "README.rst",
    "docs/conf.py"
  ],
  [
    "952ee56fa6ce36c752117e79cc381df8",
    "6e9c7d805a1d33f0719b14fe28554ab1"
  ]
]

then in Python:

dict(zip(l[0],l[1]))

to get:

{
    'README.rst': '952ee56fa6ce36c752117e79cc381df8', 
    'docs/conf.py': '6e9c7d805a1d33f0719b14fe28554ab1'
}

Hope that helps. :)

PS. I'm using ObjectPath's keys() to show how to write a full query that works anywhere in the document, not only when the keys are at the root of the document.

PS2. I might add a new function so that it would look like: object([keys($.*), $..md5]). Send me a tweet at http://twitter.com/adriankal if you want that.

Manwe · Answer

I missed the Python requirement, but if you are willing to call an external program, this will still work. Note that jq >= 1.5 is required.

# If a single top-level key ($p[0]) has several md5 keys below it, "add" keeps only one of them.
cat /tmp/test.json | \
jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"]}] | add'

# This does not build a single object, but it shows every key/md5 combination.
cat /tmp/test.json | \
jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"]}]'

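How it works: paths(has("md5")?) yields the path to every object that has an "md5" key; the '?' suppresses errors, such as testing a scalar for a key. For each path $p, { ($p[0]): getpath($p)["md5"] } builds a one-entry object that maps the top-level key to the md5 value. The [] around the whole expression collects these objects into an array, and "| add" merges them into a single object.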
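For comparison, the same path-based idea can be sketched in plain Python. This is only an approximation of the jq filter, not a translation of jq's internals, and the helper names paths_with_key, getpath and md5_by_top_level_key are made up for this example.

```python
def paths_with_key(data, key, prefix=()):
    """Yield the path (a tuple of keys/indexes) to every dict that contains `key`."""
    if isinstance(data, dict):
        if key in data:
            yield prefix
        for k, v in data.items():
            yield from paths_with_key(v, key, prefix + (k,))
    elif isinstance(data, list):
        for i, v in enumerate(data):
            yield from paths_with_key(v, key, prefix + (i,))


def getpath(data, path):
    """Follow `path` step by step and return the value found there."""
    for step in path:
        data = data[step]
    return data


def md5_by_top_level_key(data, key="md5"):
    """Map the first path component to the value of `key`; later matches overwrite earlier ones."""
    result = {}
    for p in paths_with_key(data, key):
        if p:  # skip the empty root path, which has no top-level key
            result[p[0]] = getpath(data, p)[key]
    return result


db = {
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}},
}
print(md5_by_top_level_key(db))
```

Like the "add" variant, this keeps only one md5 per top-level key when several are nested below it.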

https://stedolan.github.io/jq/
