Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python - List of unique dictionaries

People also ask

How do I get a list of unique values from a dictionary?

We can use the dict. fromkeys method of the dict class to get unique values from a Python list. This method preserves the original order of the elements and keeps only the first element from the duplicates.

Can Python have a list of dictionaries?

In Python, we can have a list or an array of dictionaries. In such an object, every element of a list is a dictionary. Every dictionary can be accessed using its index.

What is a list of dictionaries Python?

Lists are mutable data types in Python. Lists is a 0 based index datatype meaning the index of the first element starts at 0. Lists are used to store multiple items in a single variable. Lists are one of the 4 data types present in Python i.e. Lists, Dictionary, Tuple & Set.


So make a temporary dict with the key being the id. This filters out the duplicates. The values() of the dict will be the list

In Python2.7

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ]
>>> {v['id']:v for v in L}.values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python3

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ] 
>>> list({v['id']:v for v in L}.values())
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

In Python2.5/2.6

>>> L=[
... {'id':1,'name':'john', 'age':34},
... {'id':1,'name':'john', 'age':34},
... {'id':2,'name':'hanna', 'age':30},
... ] 
>>> dict((v['id'],v) for v in L).values()
[{'age': 34, 'id': 1, 'name': 'john'}, {'age': 30, 'id': 2, 'name': 'hanna'}]

The usual way to find just the common elements in a set is to use Python's set class. Just add all the elements to the set, then convert the set to a list, and bam the duplicates are gone.

The problem, of course, is that a set() can only contain hashable entries, and a dict is not hashable.

If I had this problem, my solution would be to convert each dict into a string that represents the dict, then add all the strings to a set() then read out the string values as a list() and convert back to dict.

A good representation of a dict in string form is JSON format. And Python has a built-in module for JSON (called json of course).

The remaining problem is that the elements in a dict are not ordered, and when Python converts the dict to a JSON string, you might get two JSON strings that represent equivalent dictionaries but are not identical strings. The easy solution is to pass the argument sort_keys=True when you call json.dumps().

EDIT: This solution was assuming that a given dict could have any part different. If we can assume that every dict with the same "id" value will match every other dict with the same "id" value, then this is overkill; @gnibbler's solution would be faster and easier.

EDIT: Now there is a comment from André Lima explicitly saying that if the ID is a duplicate, it's safe to assume that the whole dict is a duplicate. So this answer is overkill and I recommend @gnibbler's answer.


In case the dictionaries are only uniquely identified by all items (ID is not available) you can use the answer using JSON. The following is an alternative that does not use JSON, and will work as long as all dictionary values are immutable

[dict(s) for s in set(frozenset(d.items()) for d in L)]

Here's a reasonably compact solution, though I suspect not particularly efficient (to put it mildly):

>>> ds = [{'id':1,'name':'john', 'age':34},
...       {'id':1,'name':'john', 'age':34},
...       {'id':2,'name':'hanna', 'age':30}
...       ]
>>> map(dict, set(tuple(sorted(d.items())) for d in ds))
[{'age': 30, 'id': 2, 'name': 'hanna'}, {'age': 34, 'id': 1, 'name': 'john'}]

You can use numpy library (works for Python2.x only):

   import numpy as np 

   list_of_unique_dicts=list(np.unique(np.array(list_of_dicts)))

To get it worked with Python 3.x (and recent versions of numpy), you need to convert array of dicts to numpy array of strings, e.g.

list_of_unique_dicts=list(np.unique(np.array(list_of_dicts).astype(str)))