
Given a list of dictionaries, how can I eliminate duplicates of one key and sort by another?

I'm working with a list of dict objects that looks like this (the order of the objects differs):

[
    {'name': 'Foo', 'score': 1},
    {'name': 'Bar', 'score': 2},
    {'name': 'Foo', 'score': 3},
    {'name': 'Bar', 'score': 3},
    {'name': 'Foo', 'score': 2},
    {'name': 'Baz', 'score': 2},
    {'name': 'Baz', 'score': 1},
    {'name': 'Bar', 'score': 1}
]

What I want to do is remove duplicate names, keeping only the one of each name that has the highest 'score'. The results from the above list would be:

[
    {'name': 'Baz', 'score': 2},
    {'name': 'Foo', 'score': 3},
    {'name': 'Bar', 'score': 3}
]

I'm not sure which pattern to use here, aside from a seemingly clumsy loop that keeps checking whether the current dict's 'name' is already in the list and, if so, whether its 'score' is higher than the existing one's 'score'.

orokusaki asked Feb 03 '12 04:02





1 Answer

One way to do that is:

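import collections

# Collect every score seen for each name (my_list is the list from the question)
data = collections.defaultdict(list)
for i in my_list:
    data[i['name']].append(i['score'])
# Keep only the highest score per name
output = [{'name': i, 'score': max(j)} for i, j in data.items()]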

so output will be (the order of the entries depends on the Python version):

[{'score': 2, 'name': 'Baz'},
 {'score': 3, 'name': 'Foo'},
 {'score': 3, 'name': 'Bar'}]
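
If you also want the result sorted, as the title asks, here is a minimal sketch of the same idea done in one pass and then ordered by 'score'. The function name `best_per_name` is made up for illustration, and the ascending sort is an assumption, since the question does not say which direction it wants. It keeps the whole winning dict rather than rebuilding it, which might help if your dicts carry more keys than 'name' and 'score'.

```python
from operator import itemgetter

my_list = [
    {'name': 'Foo', 'score': 1},
    {'name': 'Bar', 'score': 2},
    {'name': 'Foo', 'score': 3},
    {'name': 'Bar', 'score': 3},
    {'name': 'Foo', 'score': 2},
    {'name': 'Baz', 'score': 2},
    {'name': 'Baz', 'score': 1},
    {'name': 'Bar', 'score': 1},
]


def best_per_name(records):
    """Keep the highest-scoring dict for each name, sorted by score (ascending).

    Hypothetical helper, not part of any library.
    """
    best = {}
    for record in records:
        current = best.get(record['name'])
        # Store the record if the name is new or this score beats the stored one
        if current is None or record['score'] > current['score']:
            best[record['name']] = record
    # sorted() is stable, so names with equal scores keep the order
    # in which they first appeared (assumes Python 3.7+ dict ordering)
    return sorted(best.values(), key=itemgetter('score'))


output = best_per_name(my_list)
print(output)
```

On Python 3.7 or later this should give Baz (2), Foo (3), Bar (3) for the list in the question. For descending order, pass `reverse=True` to `sorted()`.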
JBernardo answered Nov 09 '22 22:11
