
Remove duplicate JSON objects from a list in Python

I have a list of dicts in which the same dict is repeated multiple times, and I would like to remove the duplicates.

My list:

te = [
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      }
    ]

My function to remove duplicate values:

def removeduplicate(it):
    seen = set()
    for x in it:
        if x not in seen:
            yield x
            seen.add(x)

When I call this function I get a generator object:

<generator object removeduplicate at 0x0170B6E8>

When I try to iterate over the generator, I get TypeError: unhashable type: 'dict'.

Is there a way to remove the duplicate values, or to iterate over the generator?

Tony Roczz asked Nov 27 '15 10:11



2 Answers

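You can remove the duplicates with a dictionary comprehension keyed on the 'Name' value, since a dictionary cannot hold duplicate keys. Note that this keeps only the last dict seen for each name, so entries that share a name but differ in other fields are collapsed into one: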

te = [
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
        "Name": "Bala",
        "phone": "None"
      },
      {
          "Name": "Bala1",
          "phone": "None"
      }      
    ]

unique = list({each['Name']: each for each in te}.values())

print(unique)

Output (Python 3.7+, where dicts keep insertion order):

[{'Name': 'Bala', 'phone': 'None'}, {'Name': 'Bala1', 'phone': 'None'}]
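
If dicts that share a name but differ in another field should be kept apart, one option is to build the key from every key/value pair rather than from 'Name' alone. The following is a minimal sketch, not part of the original answer; the helper name unique_by_all_fields and the sample records are hypothetical, and it assumes all values are hashable and all keys are strings.

```python
def unique_by_all_fields(records):
    """Return one dict per distinct set of key/value pairs, in first-seen order."""
    by_content = {}
    for d in records:
        # tuple(sorted(d.items())) is hashable when every value is hashable;
        # sorting makes the key independent of the dict's insertion order.
        by_content.setdefault(tuple(sorted(d.items())), d)
    return list(by_content.values())


# Hypothetical sample data: the third record shares a name but has a different phone.
records = [
    {"Name": "Bala", "phone": "None"},
    {"Name": "Bala", "phone": "None"},
    {"Name": "Bala", "phone": "12345"},
    {"Name": "Bala1", "phone": "None"},
]
result = unique_by_all_fields(records)
print(result)
```

Using setdefault keeps the first occurrence of each distinct dict, whereas a plain comprehension would keep the last one.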
SIslam answered Oct 14 '22 21:10


The error occurs because you can't add a dict to a set. Quoting from a related question:

You're trying to use a dict as a key to another dict or in a set. That does not work because the keys have to be hashable.

As a general rule, only immutable objects (strings, integers, floats, frozensets, tuples of immutables) are hashable (though exceptions are possible).

>>> foo = dict()
>>> bar = set()
>>> bar.add(foo)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> 
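
To illustrate the hashability rule quoted above, here is a small sketch (an illustration added for clarity, with hypothetical variable names) showing that a dict cannot be hashed, while a frozenset of its items can, provided the values themselves are hashable.

```python
d = {"Name": "Bala", "phone": "None"}

# A dict is mutable, so hash() refuses it with TypeError.
try:
    hash(d)
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# A frozenset of (key, value) pairs is immutable and therefore hashable,
# assuming every value in the dict is hashable too.
key = frozenset(d.items())
seen = {key}

# The same content in a different insertion order produces an equal key.
same_content = {"phone": "None", "Name": "Bala"}
found = frozenset(same_content.items()) in seen

print(dict_is_hashable, found)
```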

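Since your function only tests membership with if x not in seen, the simplest fix is to use a list. List membership compares elements by equality and does not need hashing, at the cost of a linear scan for every lookup: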

>>> te = [
...       {
...         "Name": "Bala",
...         "phone": "None"
...       },
...       {
...         "Name": "Bala",
...         "phone": "None"
...       },
...       {
...         "Name": "Bala",
...         "phone": "None"
...       },
...       {
...         "Name": "Bala",
...         "phone": "None"
...       }
...     ]

>>> def removeduplicate(it):
...     seen = []
...     for x in it:
...         if x not in seen:
...             yield x
...             seen.append(x)

>>> removeduplicate(te)
<generator object removeduplicate at 0x7f3578c71ca8>

>>> list(removeduplicate(te))
[{'phone': 'None', 'Name': 'Bala'}]
>>> 
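
For long lists, the linear scan through seen can become slow. A possible alternative, sketched here under the assumption that every dict is JSON-serializable, keeps the set but stores a canonical string for each dict instead of the dict itself. The name remove_duplicates_json is hypothetical, and json.dumps with sort_keys=True is used only as one convenient way to build a hashable key.

```python
import json


def remove_duplicates_json(it):
    """Yield each distinct dict once, in first-seen order."""
    seen = set()
    for x in it:
        # Assumption: x is JSON-serializable. sort_keys=True makes the string
        # independent of key order, so equal dicts map to the same key.
        key = json.dumps(x, sort_keys=True)
        if key not in seen:
            seen.add(key)
            yield x


te = [
    {"Name": "Bala", "phone": "None"},
    {"Name": "Bala", "phone": "None"},
    {"phone": "None", "Name": "Bala"},
    {"Name": "Bala1", "tags": ["a", "b"]},
    {"Name": "Bala1", "tags": ["a", "b"]},
]
unique = list(remove_duplicates_json(te))
print(unique)
```

Unlike a frozenset of items, this also copes with nested lists and dicts as values. It can differ from == in edge cases: for example, 1 and 1.0 compare equal in Python but serialize to different strings.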
Remi Crystal answered Oct 14 '22 20:10