
How to remove duplicate dictionaries from a list of dictionaries

I want to remove duplicate dictionaries from a list of dictionaries. Two dictionaries count as duplicates if they have the same 'plate' value, and I want each plate to appear only once.

datalist = [

{
    'plate': "01",
    'confidence' : "80"
},

{
    'plate': "01",
    'confidence' : "60"
},

{
    'plate': "02",
    'confidence' : "91"
},

{
    'plate': "02",
    'confidence' : "91"
},
]

My output should be like this:

datalist = [

{
    'plate': "01",
    'confidence' : "80"
},

{
    'plate': "02",
    'confidence' : "91"
},
]

This is my code, but I'm not getting the expected result.

def filter(datalist):
    previous = ""
    for data in datalist:
        current  = data['plate']
        if current is previous:
            datalist.remove(data)
        previous = current 

    return datalist

datalist = [

    {
        'plate': "01",
        'confidence' : "80"
    },

    {
        'plate': "01",
        'confidence' : "60"
    },

    {
        'plate': "02",
        'confidence' : "91"
    },

    {
        'plate': "02",
        'confidence' : "91"
    },
]


print (filter(datalist))

This gives me the output:

[

    {
        'plate': "01",
        'confidence' : "80"
    },

    {
        'plate': "02",
        'confidence' : "91"
    },

    {
        'plate': "02",
        'confidence' : "91"
    },
]

This is not what I expected. What's wrong with my code?

Khaalidi asked Jan 04 '19 12:01




4 Answers

If any element from the groups of duplicates is acceptable, you could do:

datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

result = list({ d['plate'] : d for d in datalist }.values())
print(result)

Output (on Python 3.7+, where dicts preserve insertion order)

[{'plate': '01', 'confidence': '60'}, {'plate': '02', 'confidence': '91'}]

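The idea is to build a dictionary whose keys are the plate values and whose values are the dictionaries themselves. Later entries overwrite earlier ones, so the last duplicate for each plate is kept. To keep the first duplicate instead, iterate over the list in reverse with reversed (note that the result then comes out in reverse order):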

result = list({d['plate']: d for d in reversed(datalist)}.values())

Output

[{'plate': '02', 'confidence': '91'}, {'plate': '01', 'confidence': '80'}]
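
If both the first entry per plate and the original order matter, one possible variation is to fill the dictionary with setdefault instead of a comprehension. This is only a sketch and assumes Python 3.7+, where dicts keep insertion order; the name first_by_plate is made up for illustration.

```python
datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

# first_by_plate is a hypothetical helper name, not part of any library.
first_by_plate = {}
for d in datalist:
    # setdefault stores d only when this plate is not yet a key,
    # so later duplicates of the same plate are ignored.
    first_by_plate.setdefault(d['plate'], d)

# Relies on dicts preserving insertion order (Python 3.7+).
result = list(first_by_plate.values())
print(result)
# [{'plate': '01', 'confidence': '80'}, {'plate': '02', 'confidence': '91'}]
```

Unlike the reversed version, this keeps plate "01" ahead of plate "02" in the result.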
Dani Mesejo answered Dec 12 '22 04:12


You can use the unique_everseen recipe from the itertools documentation, which is also available in the third-party more_itertools package:

from more_itertools import unique_everseen
from operator import itemgetter    

datalist = list(unique_everseen(datalist, key=itemgetter('plate')))

Internally, this solution uses a set to keep track of the plates already seen and yields only dictionaries with a new plate value. Therefore, the original order is maintained and only the first instance of any given plate is kept.
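
If you would rather not add a dependency, the same seen-set idea can be sketched in plain Python. This is not the more_itertools implementation, only a rough equivalent for this case, and unique_by_key is a hypothetical name. It also avoids the two problems in the question's loop: it builds a new list instead of calling remove on the list being iterated (which makes the loop skip the element after each removal), and it compares plates by value rather than with is.

```python
def unique_by_key(items, key):
    """Return the first dictionary seen for each distinct items[i][key].

    Hypothetical helper for illustration; assumes the key values are hashable.
    """
    seen = set()
    result = []
    for item in items:
        value = item[key]
        if value not in seen:  # membership test compares by value, not identity
            seen.add(value)
            result.append(item)  # build a new list; the input is never mutated
    return result


datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

deduped = unique_by_key(datalist, 'plate')
print(deduped)
# [{'plate': '01', 'confidence': '80'}, {'plate': '02', 'confidence': '91'}]
```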

jpp answered Dec 12 '22 05:12


You can also use pandas:

import pandas as pd
df = pd.DataFrame(data=datalist)
df.drop_duplicates(subset=['plate'], keep='first', inplace=True)
output = df.to_dict(orient='records')

The keep argument ('first' or 'last') controls which of the duplicate entries is kept in the output.

LMSharma answered Dec 12 '22 03:12


If you already use pandas, consider:

>>> import pandas as pd
>>> datalist = [{'plate': "01", 'confidence': "80"}, {'plate': "01", 'confidence': "60"}, {'plate': "02", 'confidence': "91"}, {'plate': "02", 'confidence': "91"}]
>>> pd.DataFrame(datalist).drop_duplicates('plate').to_dict(orient='records')
[{'confidence': '80', 'plate': '01'}, {'confidence': '91', 'plate': '02'}]

If you want to keep the last seen duplicates, pass keep='last'.

>>> pd.DataFrame(datalist).drop_duplicates('plate', keep='last').to_dict(orient='records')
[{'confidence': '60', 'plate': '01'}, {'confidence': '91', 'plate': '02'}]
timgeb answered Dec 12 '22 03:12