I have a list of objects with multiple attributes. I want to filter the list based on one attribute of the object (country_code), keeping only the first object for each country code, i.e.
Current list:
elems = [{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '3200'},
         {'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '4000'},
         {'region_code': 'EUD', 'country_code': 'SVN', 'country_desc': 'Slovenia', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'NLD', 'country_desc': 'Netherlands', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'BEL', 'country_desc': 'Belgium', 'event_number': '6880'}]
Desired list:
elems = [{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'SVN', 'country_desc': 'Slovenia', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'NLD', 'country_desc': 'Netherlands', 'event_number': '6880'},
         {'region_code': 'EUD', 'country_code': 'BEL', 'country_desc': 'Belgium', 'event_number': '6880'}]
I can achieve this by creating a dictionary and a for-loop, but I feel like there's an easier way in Python using the filter() or reduce() functions; I just can't figure out how.
Can anyone simplify the code below using built-in Python functions? Performance is a big factor because the real data will be substantial.
Working code:
unique = {}
for elem in elems:
    if elem['country_code'] not in unique.keys():
        unique[elem['country_code']] = elem
print(unique.values())
Worth noting I have also tried the code below, but it performs worse than the current working code:
unique = []
for elem in elems:
    if not any(u['country_code'] == elem['country_code'] for u in unique):
        unique.append(elem)
I think your first approach is already pretty close to being optimal. Dictionary lookup is fast (just as fast as in a set), and the loop is easy to understand, even if a bit lengthy by Python standards. You should not sacrifice readability for brevity.
You can, however, shave off one line using setdefault, and you might want to use collections.OrderedDict() so that the elements in the resulting list are in their original order. Also, note that in Python 3, unique.values() is not a list but a view on the dict; wrap it in list() if you need an actual list.
import collections

unique = collections.OrderedDict()
for elem in elems:
    # setdefault stores elem only if the country code is not yet a key
    unique.setdefault(elem["country_code"], elem)
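For illustration, here is a minimal end-to-end sketch of the setdefault approach. It assumes Python 3.7 or later, where a plain dict also preserves insertion order, so OrderedDict is swapped for an ordinary dict here. The shortened sample data is made up for the example.

```python
# Shortened, made-up sample data; only the fields needed for the example.
elems = [
    {"country_code": "ROM", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "3200"},
    {"country_code": "SVN", "event_number": "6880"},
]

# Assumption: Python 3.7+, where a plain dict keeps insertion order.
unique = {}
for elem in elems:
    # Keeps the first element seen for each country code.
    unique.setdefault(elem["country_code"], elem)

# values() is a view in Python 3, so convert it to get a real list.
result = list(unique.values())
print(result)
```

On older Python versions, collections.OrderedDict() would be the safer choice for keeping the original order.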
If you really, really want to use reduce, you can use the empty dict as an initializer and then use d.setdefault(k,v) and d to set the value (if not present) and return the modified dict.
import collections
from functools import reduce  # in Python 3, reduce is no longer a built-in
unique = reduce(lambda unique, elem: unique.setdefault(elem["country_code"], elem) and unique,
                elems, collections.OrderedDict())
I would just use the loop, though.
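If reduce is used anyway, one caveat of the "setdefault(...) and d" idiom is that it relies on the stored value being truthy; an empty dict as an element would make the expression return that element instead of the accumulator. A rough sketch of a variant with a named step function that returns the accumulator explicitly (keep_first is a made-up name, and the sample data is shortened for the example):

```python
from functools import reduce

def keep_first(acc, elem):
    """Hypothetical step function: store elem under its country code if unseen."""
    acc.setdefault(elem["country_code"], elem)
    return acc  # always return the accumulator, regardless of truthiness

# Shortened, made-up sample data.
elems = [
    {"country_code": "ROM", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "3200"},
    {"country_code": "BEL", "event_number": "6880"},
]

# Assumption: Python 3.7+, so a plain dict keeps insertion order.
unique = reduce(keep_first, elems, {})
result = list(unique.values())
print(result)
```

This does the same work as the plain loop, so it is unlikely to be faster; it only removes the dependence on truthiness.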
I think that your approach is just fine. It would be slightly better to check elem['country_code'] not in unique instead of elem['country_code'] not in unique.keys(), since in Python 2 keys() builds a list first and the membership test then becomes a linear scan.
However, here is another way to do it with a list comprehension:
visited = set()
res = [e for e in elems
       if e['country_code'] not in visited
       and not visited.add(e['country_code'])]
The last bit abuses the fact that not None evaluates to True and that set.add returns None.
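If the side effect inside the condition feels too clever, the same set-based idea can be written as a small generator. This is only a sketch: unique_by is a made-up helper name, not a standard library function, and the sample data is shortened for the example.

```python
from operator import itemgetter

def unique_by(items, key):
    """Hypothetical helper: yield the first item for each distinct key(item)."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item

# Shortened, made-up sample data.
elems = [
    {"country_code": "ROM", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "3200"},
    {"country_code": "NLD", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "4000"},
]

res = list(unique_by(elems, itemgetter("country_code")))
print(res)
```

Because it is a generator, it also works lazily on large inputs; the key values just need to be hashable.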