I have a list of objects with multiple attributes. I want to filter the list so that only the first object for each value of one attribute (country_code) is kept, i.e.
Current list:
elems = [{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc':'Romania', 'event_number': '3200'},
{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '4000'},
{'region_code': 'EUD', 'country_code': 'SVN', 'country_desc': 'Slovenia', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'NLD', 'country_desc':'Netherlands', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'BEL', 'country_desc':'Belgium', 'event_number': '6880'}]
Desired list:
elems = [{'region_code': 'EUD', 'country_code': 'ROM', 'country_desc': 'Romania', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'SVN', 'country_desc': 'Slovenia', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'NLD', 'country_desc': 'Netherlands', 'event_number': '6880'},
{'region_code': 'EUD', 'country_code': 'BEL', 'country_desc': 'Belgium', 'event_number': '6880'}]
I can achieve this by creating a dictionary and a for-loop, but I feel like there's an easier way in Python using the filter() or reduce() functions. I just can't figure out how.
Can anyone simplify the code below using built-in Python functions? Performance is a big factor because the real data will be substantial.
Working code:
unique = {}
for elem in elems:
    if elem['country_code'] not in unique.keys():
        unique[elem['country_code']] = elem
print(unique.values())
Worth noting I have also tried the code below, but it performs worse than the current working code:
unique = []
for elem in elems:
    if not any(u['country_code'] == elem['country_code'] for u in unique):
        unique.append(elem)
I think your first approach is already pretty close to being optimal. Dictionary lookup is fast (just as fast as in a set) and the loop is easy to understand, even though a bit lengthy (by Python standards), but you should not sacrifice readability for brevity.
You can, however, shave off one line using setdefault, and you might want to use collections.OrderedDict() so that the elements in the resulting list are in their original order. Also, note that in Python 3, unique.values() is not a list but a view on the dict.
import collections

unique = collections.OrderedDict()
for elem in elems:
    # setdefault only stores elem if the country code is not a key yet
    unique.setdefault(elem["country_code"], elem)
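As a minimal sketch of the point about views: assuming Python 3.7 or later, where a plain dict also preserves insertion order, the same loop works with a built-in dict, and list() turns the view into a real list. The sample data is a shortened, illustrative version of the list in the question, not the real data.

```python
# Shortened sample data modeled on the question (illustrative only).
elems = [
    {"country_code": "ROM", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "3200"},
    {"country_code": "SVN", "event_number": "6880"},
]

unique = {}  # assumption: Python 3.7+, where dict keeps insertion order
for elem in elems:
    # setdefault stores elem only if the key is not present yet,
    # so the first element per country code wins
    unique.setdefault(elem["country_code"], elem)

# values() is a view in Python 3; list() materializes it
result = list(unique.values())
print(result)
```

On older Python versions the OrderedDict variant above is the safer choice if order matters.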
If you really, really want to use reduce, you can use the empty dict as an initializer and then use d.setdefault(k,v) and d to set the value (if not present) and return the modified dict.
import collections
from functools import reduce  # reduce is not a built-in in Python 3

unique = reduce(lambda unique, elem: unique.setdefault(elem["country_code"], elem) and unique,
                elems, collections.OrderedDict())
I would just use the loop, though.
I think that your approach is just fine. It would be slightly better to check elem['country_code'] not in unique instead of elem['country_code'] not in unique.keys().
However, here is another way to do it with a list comprehension:
visited = set()
res = [e for e in elems
if e['country_code'] not in visited
and not visited.add(e['country_code'])]
The last bit abuses the fact that set.add returns None and that not None is True.
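If the side effect inside the comprehension feels too clever, the same idea can be written as a small generator. This is only a sketch: unique_by and its key parameter are hypothetical names, not part of the standard library, and the sample data is a shortened stand-in for the list in the question.

```python
def unique_by(items, key):
    """Yield the first item seen for each distinct key(item) (hypothetical helper)."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)  # remember the key so later duplicates are skipped
            yield item


# Shortened sample data modeled on the question (illustrative only).
elems = [
    {"country_code": "ROM", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "3200"},
    {"country_code": "SVN", "event_number": "6880"},
    {"country_code": "ROM", "event_number": "4000"},
]

res = list(unique_by(elems, key=lambda e: e["country_code"]))
print(res)
```

Like the comprehension, this makes a single pass with a set lookup per element, and it keeps the original order of first occurrences.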