I have a list of objects, and I want to filter it so that the result contains only one occurrence of each attribute value.
For instance, let's say I have three objects:
obj1.my_attr = 'a'
obj2.my_attr = 'b'
obj3.my_attr = 'b'
obj_list = [obj1, obj2, obj3]
In the end, I want to get [obj1, obj2]. Actually, order does not matter, so [obj1, obj3] is just as good.
First I thought of the typical clunky imperative way, like the following:
record = set()
result = []
for obj in obj_list:
    if obj.my_attr not in record:
        record.add(obj.my_attr)
        result.append(obj)
Then I thought of mapping it to a dictionary, using the key to override any previous entry, and finally extracting the values:
result = {obj.my_attr: obj for obj in obj_list}.values()
This one looks good, but I would like to know if there is a more elegant, efficient, or functional way of achieving this. Maybe some sweet thing hidden in the standard library... Thanks in advance.
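One detail of the dictionary approach is worth spelling out: later entries overwrite earlier ones, so it keeps the last object seen for each value, and `.values()` returns a view rather than a list. The sketch below illustrates this and shows a `setdefault` variant that keeps the first object instead. The `SimpleNamespace` objects and their `name` field are stand-ins assumed for illustration, not part of the original code.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the objects in the question.
obj1 = SimpleNamespace(name="obj1", my_attr="a")
obj2 = SimpleNamespace(name="obj2", my_attr="b")
obj3 = SimpleNamespace(name="obj3", my_attr="b")
obj_list = [obj1, obj2, obj3]

# Dict comprehension: a later object replaces an earlier one with the same key.
# list() turns the values view into a real list.
keep_last = list({obj.my_attr: obj for obj in obj_list}.values())

# setdefault only stores an object if its key is not present yet,
# so the first object per attribute value is kept.
first_seen = {}
for obj in obj_list:
    first_seen.setdefault(obj.my_attr, obj)
keep_first = list(first_seen.values())
```

Since order does not matter in the question, either variant fits; the difference only shows up if it matters which of the duplicates survives.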
If you want to use a functional programming style in Python, you may want to check out the toolz package. With toolz, you could simply do:
toolz.unique(obj_list, key=lambda x: x.my_attr)
For better performance, you could use operator.attrgetter('my_attr') instead of the lambda function for the key. You could also use cytoolz, which is a fast implementation of toolz written in Cython.
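If adding a dependency is not an option, the same idea can be sketched with the standard library alone. The generator below is a hypothetical helper (`unique_by` is a name made up here, not a toolz or stdlib function) that behaves along the lines of a key-based unique filter: it lazily yields the first item seen for each key. The `SimpleNamespace` objects are again assumed stand-ins.

```python
from operator import attrgetter
from types import SimpleNamespace


def unique_by(items, key):
    """Lazily yield the first item seen for each distinct key(item)."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


# Hypothetical sample objects with a my_attr attribute.
objs = [SimpleNamespace(my_attr=v) for v in ("a", "b", "b")]

# The generator is lazy, so wrap it in list() to materialize the result.
result = list(unique_by(objs, key=attrgetter("my_attr")))
```

Like the loop in the question, this requires the attribute values to be hashable, and it preserves the input order.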
You could use a wrapper object that defines custom __hash__ and __eq__ methods:
class HashMyAttr:
    def __init__(self, obj):
        self.obj = obj

    def __hash__(self):
        # Hash on the attribute, so wrappers with equal my_attr collide.
        return hash(self.obj.my_attr)

    def __eq__(self, other):
        return self.obj.my_attr == other.obj.my_attr
And use it like:
obj_list = [x.obj for x in set(HashMyAttr(obj) for obj in obj_list)]