I have a dict object from which I want to calculate the sum and the mean of specific named values.
Setup:
import yaml

tree_str = """{
    'trees': [
        {
            'tree_idx': 0,
            'dimensions': [1120, 640],
            'branches': [
                {
                    'leaves': [
                        {'geometry': [[0.190673828125, 0.0859375], [0.74609375, 0.1181640625]]},
                        {'geometry': [[0.1171875, 0.1162109375], [0.8076171875, 0.15625]]}
                    ]
                },
                {
                    'leaves': [
                        {'geometry': [[0.2197265625, 0.1552734375], [0.7119140625, 0.1943359375]]},
                        {'geometry': [[0.2060546875, 0.1923828125], [0.730712890625, 0.23046875]]}
                    ]
                }
            ]
        }
    ]
}"""
tree_dict = yaml.load(tree_str, Loader=yaml.Loader)
Where:
# assume for the sake of coding
{'geometry': ((xmin, ymin), (xmax, ymax))}
# where dimensions are relative to an image of a tree
Now that I have the dict object, how do I get:
1. the count of all leaves?
2. the average width and average height of all leaves?
I can access values and traverse the tree using:
tree_dict['trees'][0]['branches'][0]['leaves'][0]['geometry'][1][1]
So I could do this using nested for loops:
leafcount = 0
leafwidth = 0
leafheight = 0
sumleafwidth = 0
sumleafheight = 0
avgleafwidth = 0
avgleafheight = 0
for tree in tree_dict['trees']:
    print("TREE")
    for branch in tree['branches']:
        print("\tBRANCH")
        for leaf in branch['leaves']:
            leafcount += 1
            (lxmin, lymin), (lxmax, lymax) = leaf['geometry']
            leafwidth = lxmax - lxmin
            leafheight = lymax - lymin
            print("\t\tLEAF: x1({}), y1({}), x2({}), y2({})\n\t\t\tWIDTH: {}\n\t\t\tHEIGHT: {}".format(lxmin, lymin, lxmax, lymax, leafwidth, leafheight))
            sumleafwidth += leafwidth
            sumleafheight += leafheight
avgleafwidth = sumleafwidth / leafcount
avgleafheight = sumleafheight / leafcount
print("LEAVES\n\tCOUNT: {}\n\tAVERAGE WIDTH: {}\n\tAVERAGE HEIGHT: {}".format(leafcount, avgleafwidth, avgleafheight))
But is there a better way?
# pseudocode
leafcount = count(tree_dict['trees'][*]['branches'][*]['leaves'][*])
leaves = (tree_dict['trees'][*]['branches'][*]['leaves'][*])
sumleafwidth = sum(leaves[*]['geometry'][1][0] - leaves[*]['geometry'][0][0])
sumleafheight = sum(leaves[*]['geometry'][1][1] - leaves[*]['geometry'][0][1])
avgleafwidth = sumleafwidth / leafcount
avgleafheight = sumleafheight / leafcount
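One way to get close to the `[*]` pseudocode in plain Python is a small wildcard helper. The following is only a sketch under stated assumptions: `select` and `STAR` are hypothetical names made up for illustration (not part of any library), and the sample data assumes each branch is a dict holding a 'leaves' list, as the access path above implies.

```python
from statistics import mean

STAR = object()  # hypothetical wildcard marker, standing in for [*]


def select(obj, *path):
    """Yield every value reached by following path; STAR fans out over a list."""
    if not path:
        yield obj
        return
    head, rest = path[0], path[1:]
    if head is STAR:
        # Wildcard step: descend into every item of the current list
        for item in obj:
            yield from select(item, *rest)
    else:
        # Ordinary step: a dict key or a list index
        yield from select(obj[head], *rest)


# Sample data mirroring the question (assumption: branches are dicts with 'leaves')
tree_dict = {
    'trees': [{
        'tree_idx': 0,
        'dimensions': (1120, 640),
        'branches': [
            {'leaves': [
                {'geometry': [[0.190673828125, 0.0859375], [0.74609375, 0.1181640625]]},
                {'geometry': [[0.1171875, 0.1162109375], [0.8076171875, 0.15625]]},
            ]},
            {'leaves': [
                {'geometry': [[0.2197265625, 0.1552734375], [0.7119140625, 0.1943359375]]},
                {'geometry': [[0.2060546875, 0.1923828125], [0.730712890625, 0.23046875]]},
            ]},
        ],
    }]
}

leaves = list(select(tree_dict, 'trees', STAR, 'branches', STAR, 'leaves', STAR))
leafcount = len(leaves)
# geometry is ((xmin, ymin), (xmax, ymax))
avgleafwidth = mean(leaf['geometry'][1][0] - leaf['geometry'][0][0] for leaf in leaves)
avgleafheight = mean(leaf['geometry'][1][1] - leaf['geometry'][0][1] for leaf in leaves)
print("COUNT: {}, AVERAGE WIDTH: {}, AVERAGE HEIGHT: {}".format(
    leafcount, avgleafwidth, avgleafheight))
```

The helper is a generator, so it never builds intermediate lists for the tree and branch levels; only the final `list(...)` call materialises the leaves, which is needed here because they are iterated twice.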
A Python dict can represent a tree in most cases, but for even slightly more advanced tree-related tasks, such as the ones described above, I don't think it is the best data structure to use. There are many implementations of tree-like structures in Python, e.g. treelib. You can convert a dict to a Tree like this, for example:
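from treelib import Tree

def dict_to_tree(data, parent_node=None, tree=None):
    # Note: a treelib Tree has a single root, so the top-level dict should have one key
    if tree is None:
        tree = Tree()

    for key, value in data.items():
        if isinstance(value, dict):
            # Create a node for the key; treelib generates a unique identifier,
            # which avoids clashes when a key such as 'leaves' appears more than once
            node = tree.create_node(tag=key, parent=parent_node)
            # Recursively call the function to process the sub-dictionary
            dict_to_tree(value, parent_node=node.identifier, tree=tree)
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # A list of dicts (e.g. 'branches'): one indexed child node per item
            node = tree.create_node(tag=key, parent=parent_node)
            for i, item in enumerate(value):
                child = tree.create_node(tag=f"{key}[{i}]", parent=node.identifier)
                dict_to_tree(item, parent_node=child.identifier, tree=tree)
        else:
            # Create a node for the key and value
            tree.create_node(tag=f"{key}: {value}", parent=parent_node)
    return tree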
With the right data structure, you should be able to solve your problem in a much easier and more elegant way.
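The same recursive idea also works without any third-party library. As a rough sketch (the function name `iter_geometries` and the trimmed sample data are assumptions made for illustration), a recursive generator can collect every 'geometry' entry regardless of how deeply it is nested:

```python
def iter_geometries(node):
    """Recursively yield every 'geometry' value found in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'geometry':
                yield value
            else:
                yield from iter_geometries(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_geometries(item)
    # Scalars (numbers, strings) contain nothing to descend into


# Trimmed sample data (assumption: same shape as the question, one branch only)
sample = {
    'trees': [{
        'tree_idx': 0,
        'branches': [
            {'leaves': [
                {'geometry': [[0.190673828125, 0.0859375], [0.74609375, 0.1181640625]]},
                {'geometry': [[0.1171875, 0.1162109375], [0.8076171875, 0.15625]]},
            ]},
        ],
    }]
}

boxes = list(iter_geometries(sample))
# Each box is ((xmin, ymin), (xmax, ymax))
widths = [xmax - xmin for (xmin, ymin), (xmax, ymax) in boxes]
heights = [ymax - ymin for (xmin, ymin), (xmax, ymax) in boxes]
print(len(boxes), sum(widths) / len(boxes), sum(heights) / len(boxes))
```

Unlike a fixed chain of nested loops, this does not depend on the exact trees/branches/leaves nesting, at the cost of visiting every value in the structure.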