Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recursively dir() a python object to find values of a certain type or with a certain value

I have a complex Python data structure (if it matters, it's a large music21 Score object) which will not pickle due to the presence of a weakref somewhere deep inside the object structure. I've debugged such problems with the stack trace and python debugger before, but it's always a big pain. Is there a tool which runs dir() recursively on all attributes of an object, finding objects hidden in lists, tuples, dicts, etc., and returns those that match a certain value (a lambda function or something like that). A big problem is recursive references, so some sort of memo function (like copy.deepcopy uses) is needed. I tried:

import weakref
def findWeakRef(streamObj, memo=None):
    weakRefList = []
    if memo is None:
        memo = {}
    for x in dir(streamObj):
        xValue = getattr(streamObj, x)
        if id(xValue) in memo:
            continue
        else:
            memo[id(xValue)] = True
        if type(xValue) is weakref.ref:
            weakRefList.append(x, xValue, streamObj)
        if hasattr(xValue, "__iter__"):
            for i in xValue:
                if id(i) in memo:
                    pass
                else:
                    memo[id(i)] = True
                    weakRefList.extend(findWeakRef(i), memo)
        else:
            weakRefList.extend(findWeakRef(xValue), memo)
    return weakRefList

I can probably continue plugging holes in this (the iter isn't what I'd want for dicts, for instance), but before I throw more time into it, wondering if someone knows an easier answer. It could be a pretty useful general tool.

like image 501
Michael Scott Asato Cuthbert Avatar asked Sep 26 '12 22:09

Michael Scott Asato Cuthbert


People also ask

What is a recursive function in Python?

Recursive functions are functions that calls itself. It is always made up of 2 portions, the base case and the recursive case. The base case is the condition to stop the recursion. The recursive case is the part where the function calls on itself.

How do you find the data of an object in Python?

In Python, you can inspect any object with the built-in help() function.

What is recursion demonstrate with an example in Python?

Recursion is the process of defining something in terms of itself. A physical world example would be to place two parallel mirrors facing each other. Any object in between them would be reflected recursively.

How do you print all the values of an object in Python?

Use Python's vars() to Print an Object's Attributes The dir() function, as shown above, prints all of the attributes of a Python object. Let's say you only wanted to print the object's instance attributes as well as their values, we can use the vars() function.


2 Answers

This seems to be the start of an answer. I had to backport some items from the Python 3.2 inspect.getattr_static to make it work so it didn't call properties that just kept generating new objects. Here's the code I came up with:

#-------------------------------------------------------------------------------
# Name:         treeYield.py
# Purpose:      traverse a complex datastructure and yield elements
#               that fit a given criteria
#
# Authors:      Michael Scott Cuthbert
#
# Copyright:    Copyright © 2012 Michael Scott Cuthbert
# License:      CC-BY
#-------------------------------------------------------------------------------
import types

class TreeYielder(object):
    def __init__(self, yieldValue = None):
        '''
        `yieldValue` should be a lambda function that
        returns True/False or a function/method call that
        will be passed the value of a current attribute
        '''        
        self.currentStack = []
        self.yieldValue = yieldValue
        self.stackVals = []
        t = types
        self.nonIterables = [t.IntType, t.StringType, t.UnicodeType, t.LongType,
                             t.FloatType, t.NoneType, t.BooleanType]

    def run(self, obj, memo = None):
        '''
        traverse all attributes of an object looking
        for subObjects that meet a certain criteria.
        yield them.

        `memo` is a dictionary to keep track of objects
        that have already been seen

        The original object is added to the memo and
        also checked for yieldValue
        '''
        if memo is None:
            memo = {}
        self.memo = memo
        if id(obj) in self.memo:
            self.memo[id(obj)] += 1
            return
        else:
            self.memo[id(obj)] = 1

        if self.yieldValue(obj) is True:
            yield obj


        ### now check for sub values...
        self.currentStack.append(obj)

        tObj = type(obj)
        if tObj in self.nonIterables:
            pass
        elif tObj == types.DictType:
            for keyX in obj:
                dictTuple = ('dict', keyX)
                self.stackVals.append(dictTuple)
                x = obj[keyX]
                for z in self.run(x, memo=memo):
                    yield z
                self.stackVals.pop()

        elif tObj in [types.ListType, types.TupleType]:
            for i,x in enumerate(obj):
                listTuple = ('listLike', i)
                self.stackVals.append(listTuple)
                for z in self.run(x, memo=memo):
                    yield z
                self.stackVals.pop()

        else: # objects or uncaught types...
            ### from http://bugs.python.org/file18699/static.py
            try:
                instance_dict = object.__getattribute__(obj, "__dict__")
            except AttributeError:
                ## probably uncaught static object
                return

            for x in instance_dict:
                try:
                    gotValue = object.__getattribute__(obj, x)
                except: # ?? property that relies on something else being set.
                    continue
                objTuple = ('getattr', x)
                self.stackVals.append(objTuple)
                try:
                    for z in self.run(gotValue, memo=memo):
                        yield z
                except RuntimeError:
                    raise Exception("Maximum recursion on:\n%s" % self.currentLevel())
                self.stackVals.pop()                

        self.currentStack.pop()

    def currentLevel(self):
        currentStr = ""
        for stackType, stackValue in self.stackVals:
            if stackType == 'dict':
                if isinstance(stackValue, str):
                    currentStr += "['" + stackValue + "']"
                elif isinstance(stackValue, unicode):
                    currentStr += "[u'" + stackValue + "']"
                else: # numeric key...
                    currentStr += "[" + str(stackValue) + "]"
            elif stackType == 'listLike':
                currentStr += "[" + str(stackValue) + "]"
            elif stackType == 'getattr':
                currentStr += ".__getattribute__('" + stackValue + "')"
            else:
                raise Exception("Cannot get attribute of type %s" % stackType)
        return currentStr

This code lets you run something like this:

class Mock(object):
    def __init__(self, mockThing, embedMock = True):
        self.abby = 30
        self.mocker = mockThing
        self.mockList = [mockThing, mockThing, 40]
        self.embeddedMock = None
        if embedMock is True:
            self.embeddedMock = Mock(mockThing, embedMock = False)

mockType = lambda x: x.__class__.__name__ == 'Mock'

subList = [100, 60, -2]
myList = [5, 20, [5, 12, 17], 30, {'hello': 10, 'goodbye': 22, 'mock': Mock(subList)}, -20, Mock(subList)]
myList.append(myList)

ty = TreeYielder(mockType)
for val in ty.run(myList):
    print(val, ty.currentLevel())

And get:

(<__main__.Mock object at 0x01DEBD10>, "[4]['mock']")
(<__main__.Mock object at 0x01DEF370>, "[4]['mock'].__getattribute__('embeddedMock')")
(<__main__.Mock object at 0x01DEF390>, '[6]')
(<__main__.Mock object at 0x01DEF3B0>, "[6].__getattribute__('embeddedMock')")

Or run:

high = lambda x: isinstance(x, (int, float)) and x > 10
ty = TreeYielder(high)
for val in ty.run(myList):
    print(val, ty.currentLevel())

And get:

(20, '[1]')
(12, '[2][1]')
(17, '[2][2]')
(30, '[3]')
(22, "[4]['goodbye']")
(100, "[4]['mock'].__getattribute__('embeddedMock').__getattribute__('mocker')[0]")
(60, "[4]['mock'].__getattribute__('embeddedMock').__getattribute__('mocker')[1]")
(40, "[4]['mock'].__getattribute__('embeddedMock').__getattribute__('mockList')[2]")

I'm still trying to figure out why .abby isn't found, but I figure it's worth posting even at this point, since it's much more on the right track than I was when I started.

like image 132
Michael Scott Asato Cuthbert Avatar answered Sep 25 '22 12:09

Michael Scott Asato Cuthbert


Here is a simpler solution that is somewhat naive. I.e. just a depth first search under the attribute tree. If its a primitive then stop , otherwise go deeper in the tree. It will get you the call tree and the value at the leaf.

def recursive_dir(obj, path):
    if ((obj!=None) and (not isinstance(obj, (str,float,int,list,dict,set)))):
        for attr, val in obj.__dict__.iteritems():
            temp_path = path[:]
            temp_path.append(attr)
            recursive_dir(getattr(obj, attr), temp_path)
    else:
        print (path, "--->", obj)
        print("")
recursive_dir(x,[])
like image 25
SriK Avatar answered Sep 21 '22 12:09

SriK