Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to find most commonly occurring dictionary inside a list of dictionaries

I have a list of dictionaries where each dictionary has keys 'shape' and 'colour'. For example:

info = [
    {'shape': 'pentagon', 'colour': 'red'},
    {'shape': 'rectangle', 'colour': 'white'},
    # etc etc
]

I need to find the most commonly occurring shape/colour combination. I decided to do this by finding the most commonly occurring dictionary in the list. I've cut my method down to this:

frequency = defaultdict(int)

for i in info:
    hashed = json.dumps(i) # Get dictionary, turn into string to store as key in frequency dict
    frequency[hashed] += 1

most_common = max(frequency, key = frequency.get) # Get most common key and unhash it back into dict
print(json.loads(most_common))

I'm kind of new to python and I always end up finding out about some 1-2 line function that ends up doing what I wanted to do in the first place. I was wondering if a quicker method existed in this case? Maybe this can end up helping another beginner out because I couldn't find anything after ages of googling.

like image 620
Jamesyyy Avatar asked Jan 01 '23 13:01

Jamesyyy


1 Answers

If the items in a list have consistent keys a better option is to use a namedtuple in place of a dict eg:

from collections import namedtuple

# Define the named tuple
MyItem = namedtuple("MyItem", "shape colour")

# Create your list of data
info = [
    MyItem('pentagon', 'red'),
    MyItem('rectangle', 'white'),
    # etc etc
]

This provides a number of benefits:

# To instantiate
item = MyItem("pentagon", "red")

# or using keyword arguments
item = MyItem(shape="pentagon", colour="red")

# or from your existing dict
item = MyItem(**{'shape': 'pentagon', 'colour': 'red'})

# Accessors
print(item.shape)
print(item.colour)

# Decomposition
shape, colour = item

However, back to the question of counting matching items, because a namedtuple is hashable collections.Counter can be used and the counting code then becomes:

from collections import Counter

frequency = Counter(info)

# Get the items in the order of most common
frequency.most_common()

Enjoy!

like image 90
Tim Avatar answered Feb 05 '23 10:02

Tim