I want to delete duplicate dictionaries from a list of dictionaries. If two dictionaries in the list have the same 'plate' value, I want to keep only one of them.
datalist = [
{
'plate': "01",
'confidence' : "80"
},
{
'plate': "01",
'confidence' : "60"
},
{
'plate': "02",
'confidence' : "91"
},
{
'plate': "02",
'confidence' : "91"
},
]
My output should be like this:
datalist = [
{
'plate': "01",
'confidence' : "80"
},
{
'plate': "02",
'confidence' : "91"
},
]
This is my code, but I'm not getting the exact result.
def filter(datalist):
    previous = ""
    for data in datalist:
        current = data['plate']
        if current is previous:
            datalist.remove(data)
        previous = current
    return datalist
datalist = [
{
'plate': "01",
'confidence' : "80"
},
{
'plate': "01",
'confidence' : "60"
},
{
'plate': "02",
'confidence' : "91"
},
{
'plate': "02",
'confidence' : "91"
},
]
print (filter(datalist))
This gives me the output:
[
{
'plate': "01",
'confidence' : "80"
},
{
'plate': "02",
'confidence' : "91"
},
{
'plate': "02",
'confidence' : "91"
},
]
which is not what I expected. What's wrong with my code?
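Dictionary keys are unique: assigning to a key that already exists replaces its value instead of adding a second entry.
You can therefore remove duplicates from a Python list using dict.fromkeys(), which builds a dictionary whose keys are the list items, so repeated items collapse into a single key. You can also convert the list to a set. Either way, convert the dictionary or set back into a list to get a list without duplicates. Both approaches require hashable items, and dictionaries are not hashable, so for a list of dictionaries you have to deduplicate on a hashable value such as the 'plate' string.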
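As a minimal sketch of the point above (the variable name unique_plates is only illustrative), dict.fromkeys() can deduplicate the plate strings directly, while trying to put the dictionaries themselves into a set raises TypeError:

```python
datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

# dict.fromkeys() keeps one key per distinct plate, in first-seen order.
unique_plates = list(dict.fromkeys(d['plate'] for d in datalist))
print(unique_plates)  # ['01', '02']

# Dictionaries are unhashable, so they cannot be set members or dict keys.
try:
    set(datalist)
except TypeError as exc:
    print("cannot build a set of dicts:", exc)
```

This only gives you the distinct plates, not the dictionaries, which is why the answers below key a dictionary on 'plate' and store the whole record as the value.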
If any element from each group of duplicates is acceptable, you could do:
datalist = [
{'plate': "01", 'confidence': "80"},
{'plate': "01", 'confidence': "60"},
{'plate': "02", 'confidence': "91"},
{'plate': "02", 'confidence': "91"},
]
result = list({ d['plate'] : d for d in datalist }.values())
print(result)
Output (on Python 3.7+, where dictionaries preserve insertion order)
[{'plate': '01', 'confidence': '60'}, {'plate': '02', 'confidence': '91'}]
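The idea is to create a dictionary where the keys are the values of plate and the values are the dictionaries themselves. A later entry overwrites an earlier one with the same plate, so the last duplicate in each group is kept. If you want to keep the first entry of each group instead, use reversed: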
result = list({d['plate']: d for d in reversed(datalist)}.values())
Output
[{'plate': '02', 'confidence': '91'}, {'plate': '01', 'confidence': '80'}]
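Note that the reversed version also returns the plates in reverse order. If you want the first entry of each group and the original order, one possible sketch (not part of the answer above; the name first_by_plate is only illustrative) uses dict.setdefault, which stores a value only when the key is not present yet:

```python
datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

first_by_plate = {}
for d in datalist:
    # setdefault() inserts d only if this plate has not been seen before,
    # so the first dictionary for each plate wins.
    first_by_plate.setdefault(d['plate'], d)

result = list(first_by_plate.values())
print(result)
# [{'plate': '01', 'confidence': '80'}, {'plate': '02', 'confidence': '91'}]
```

This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.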
You can use the unique_everseen recipe from the itertools documentation, also available in the third-party more_itertools package:
from more_itertools import unique_everseen
from operator import itemgetter
datalist = list(unique_everseen(datalist, key=itemgetter('plate')))
Internally, this solution uses a set to keep track of seen plates, yielding only dictionaries with new plate values. Therefore, ordering is maintained and only the first instance of any given plate is kept.
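The same idea can be sketched without a third-party package. The function name unique_by_plate below is hypothetical, and this is a simplified illustration of the seen-set technique rather than the more_itertools implementation. It also avoids the two likely problems in the question's loop: removing items from a list while iterating over it makes the loop skip the element after each removal, and `is` tests object identity rather than equality.

```python
def unique_by_plate(items):
    """Yield the first dictionary seen for each 'plate' value, in order."""
    seen = set()
    for item in items:
        plate = item['plate']
        # Membership in a set compares by hash and equality, not identity.
        if plate not in seen:
            seen.add(plate)
            yield item


datalist = [
    {'plate': "01", 'confidence': "80"},
    {'plate': "01", 'confidence': "60"},
    {'plate': "02", 'confidence': "91"},
    {'plate': "02", 'confidence': "91"},
]

# A new list is built; the input list is never modified during iteration.
result = list(unique_by_plate(datalist))
print(result)
# [{'plate': '01', 'confidence': '80'}, {'plate': '02', 'confidence': '91'}]
```

Unlike the question's version, this does not require duplicates to be adjacent, because every plate seen so far is remembered rather than only the previous one.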
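You can also use pandas:
import pandas as pd
df = pd.DataFrame(data=datalist)
df.drop_duplicates(subset=['plate'], keep='first', inplace=True)
output = df.to_dict(orient='records')
The keep argument ('first' or 'last') controls which entry of each group of duplicates is kept in the output. Note that orient must be spelled 'records'; recent pandas versions no longer accept the abbreviation 'record'.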
If you are a pandas user, you can consider:
>>> import pandas as pd
>>> datalist = [{'plate': "01", 'confidence': "80"}, {'plate': "01", 'confidence': "60"}, {'plate': "02", 'confidence': "91"}, {'plate': "02", 'confidence': "91"}]
>>> pd.DataFrame(datalist).drop_duplicates('plate').to_dict(orient='records')
[{'confidence': '80', 'plate': '01'}, {'confidence': '91', 'plate': '02'}]
If you want to keep the last seen duplicates, pass keep='last'.
>>> pd.DataFrame(datalist).drop_duplicates('plate', keep='last').to_dict(orient='records')
[{'confidence': '60', 'plate': '01'}, {'confidence': '91', 'plate': '02'}]