Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Xpath like query for nested python dictionaries

Is there a way to define a XPath type query for nested python dictionaries.

Something like this:

foo = {
  'spam':'eggs',
  'morefoo': {
               'bar':'soap',
               'morebar': {'bacon' : 'foobar'}
              }
   }

print( foo.select("/morefoo/morebar") )

>> {'bacon' : 'foobar'}

I also needed to select nested lists ;)

This can be done easily with @jellybean's solution:

def xpath_get(mydict, path):
    elem = mydict
    try:
        for x in path.strip("/").split("/"):
            try:
                x = int(x)
                elem = elem[x]
            except ValueError:
                elem = elem.get(x)
    except:
        pass

    return elem

foo = {
  'spam':'eggs',
  'morefoo': [{
               'bar':'soap',
               'morebar': {
                           'bacon' : {
                                       'bla':'balbla'
                                     }
                           }
              },
              'bla'
              ]
   }

print xpath_get(foo, "/morefoo/0/morebar/bacon")

[EDIT 2016] This question and the accepted answer are ancient. The newer answers may do the job better than the original answer. However I did not test them so I won't change the accepted answer.

like image 700
RickyA Avatar asked Sep 06 '11 13:09

RickyA


People also ask

How do I access nested dictionary items?

To access element of a nested dictionary, we use indexing [] syntax in Python.

How do you iterate through nested dictionaries?

Iterate over all values of a nested dictionary in python For a normal dictionary, we can just call the items() function of dictionary to get an iterable sequence of all key-value pairs.

How do you get an item from a nested dictionary in Python?

You can access individual items in a nested dictionary by specifying key in multiple square brackets. If you refer to a key that is not in the nested dictionary, an exception is raised. To avoid such exception, you can use the special dictionary get() method.

Are nested dictionaries bad practice?

There is nothing inherently wrong with nested dicts. Anything can be a dict value, and it can make sense for a dict to be one. A lot of the time when people make nested dicts, their problems could be solved slightly more easily by using a dict with tuples for keys.


9 Answers

One of the best libraries I've been able to identify, which, in addition, is very actively developed, is an extracted project from boto: JMESPath. It has a very powerful syntax of doing things that would normally take pages of code to express.

Here are some examples:

search('foo | bar', {"foo": {"bar": "baz"}}) -> "baz"
search('foo[*].bar | [0]', {
    "foo": [{"bar": ["first1", "second1"]},
            {"bar": ["first2", "second2"]}]}) -> ["first1", "second1"]
search('foo | [0]', {"foo": [0, 1, 2]}) -> [0]
like image 185
nikolay Avatar answered Oct 02 '22 01:10

nikolay


There is an easier way to do this now.

http://github.com/akesterson/dpath-python

$ easy_install dpath
>>> dpath.util.search(YOUR_DICTIONARY, "morefoo/morebar")

... done. Or if you don't like getting your results back in a view (merged dictionary that retains the paths), yield them instead:

$ easy_install dpath
>>> for (path, value) in dpath.util.search(YOUR_DICTIONARY, "morefoo/morebar", yielded=True)

... and done. 'value' will hold {'bacon': 'foobar'} in that case.

like image 37
Andrew Kesterson Avatar answered Oct 02 '22 01:10

Andrew Kesterson


Not exactly beautiful, but you might use sth like

def xpath_get(mydict, path):
    elem = mydict
    try:
        for x in path.strip("/").split("/"):
            elem = elem.get(x)
    except:
        pass

    return elem

This doesn't support xpath stuff like indices, of course ... not to mention the / key trap unutbu indicated.

like image 23
Johannes Charra Avatar answered Oct 02 '22 00:10

Johannes Charra


There is the newer jsonpath-rw library supporting a JSONPATH syntax but for python dictionaries and arrays, as you wished.

So your 1st example becomes:

from jsonpath_rw import parse

print( parse('$.morefoo.morebar').find(foo) )

And the 2nd:

print( parse("$.morefoo[0].morebar.bacon").find(foo) )

PS: An alternative simpler library also supporting dictionaries is python-json-pointer with a more XPath-like syntax.

like image 37
ankostis Avatar answered Oct 02 '22 01:10

ankostis


dict > jmespath

You can use JMESPath which is a query language for JSON, and which has a python implementation.

import jmespath # pip install jmespath

data = {'root': {'section': {'item1': 'value1', 'item2': 'value2'}}}

jmespath.search('root.section.item2', data)
Out[42]: 'value2'

The jmespath query syntax and live examples: http://jmespath.org/tutorial.html

dict > xml > xpath

Another option would be converting your dictionaries to XML using something like dicttoxml and then use regular XPath expressions e.g. via lxml or whatever other library you prefer.

from dicttoxml import dicttoxml  # pip install dicttoxml
from lxml import etree  # pip install lxml

data = {'root': {'section': {'item1': 'value1', 'item2': 'value2'}}}
xml_data = dicttoxml(data, attr_type=False)
Out[43]: b'<?xml version="1.0" encoding="UTF-8" ?><root><root><section><item1>value1</item1><item2>value2</item2></section></root></root>'

tree = etree.fromstring(xml_data)
tree.xpath('//item2/text()')
Out[44]: ['value2']

Json Pointer

Yet another option is Json Pointer which is an IETF spec that has a python implementation:

  • https://github.com/stefankoegl/python-json-pointer

From the jsonpointer-python tutorial:

from jsonpointer import resolve_pointer

obj = {"foo": {"anArray": [ {"prop": 44}], "another prop": {"baz": "A string" }}}

resolve_pointer(obj, '') == obj
# True

resolve_pointer(obj, '/foo/another%20prop/baz') == obj['foo']['another prop']['baz']
# True

>>> resolve_pointer(obj, '/foo/anArray/0') == obj['foo']['anArray'][0]
# True

like image 25
ccpizza Avatar answered Oct 01 '22 23:10

ccpizza


If terseness is your fancy:

def xpath(root, path, sch='/'):
    return reduce(lambda acc, nxt: acc[nxt],
                  [int(x) if x.isdigit() else x for x in path.split(sch)],
                  root)

Of course, if you only have dicts, then it's simpler:

def xpath(root, path, sch='/'):
    return reduce(lambda acc, nxt: acc[nxt],
                  path.split(sch),
                  root)

Good luck finding any errors in your path spec tho ;-)

like image 44
d1zzyg Avatar answered Oct 02 '22 01:10

d1zzyg


Another alternative (besides that suggested by jellybean) is this:

def querydict(d, q):
  keys = q.split('/')
  nd = d
  for k in keys:
    if k == '':
      continue
    if k in nd:
      nd = nd[k]
    else:
      return None
  return nd

foo = {
  'spam':'eggs',
  'morefoo': {
               'bar':'soap',
               'morebar': {'bacon' : 'foobar'}
              }
   }
print querydict(foo, "/morefoo/morebar")
like image 43
MarcoS Avatar answered Oct 01 '22 23:10

MarcoS


More work would have to be put into how the XPath-like selector would work. '/' is a valid dictionary key, so how would

foo={'/':{'/':'eggs'},'//':'ham'}

be handled?

foo.select("///")

would be ambiguous.

like image 40
unutbu Avatar answered Oct 01 '22 23:10

unutbu


Is there any reason for you to the query it the way like the XPath pattern? As the commenter to your question suggested, it just a dictionary, so you can access the elements in a nest manner. Also, considering that data is in the form of JSON, you can use simplejson module to load it and access the elements too.

There is this project JSONPATH, which is trying to help people do opposite of what you intend to do (given an XPATH, how to make it easily accessible via python objects), which seems more useful.

like image 32
Senthil Kumaran Avatar answered Oct 02 '22 00:10

Senthil Kumaran