
Removing duplicates using custom comparisons

The most convenient, "Pythonic" way to remove duplicates from a list is basically:

mylist = list(set(mylist))

But suppose your criterion for counting a duplicate depends on a particular member field of the objects contained in mylist.

Well, one solution is to just define __eq__ and __hash__ for the objects in mylist, and then the classic list(set(mylist)) will work.
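A minimal sketch of that approach, assuming a hypothetical `Person` class where two objects count as duplicates when their `firstname` fields match:

```python
class Person:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def __eq__(self, other):
        # Two Person objects are "equal" if their firstnames match.
        return isinstance(other, Person) and self.firstname == other.firstname

    def __hash__(self):
        # Must be consistent with __eq__: equal objects get equal hashes.
        return hash(self.firstname)

people = [Person("Ann", "Smith"), Person("Ann", "Jones"), Person("Bob", "Lee")]
unique = list(set(people))
print(len(unique))  # 2 -- the two "Ann" objects count as duplicates
```

Note that this bakes one comparison rule into the class itself, which is exactly the inflexibility the question is about.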

But sometimes you have requirements that call for a bit more flexibility. It would be very convenient to be able to create on-the-fly lambdas to use custom comparison routines to identify duplicates in different ways. Ideally, something like:

mylist = list(set(mylist, key = lambda x: x.firstname))

Of course, that doesn't actually work, because the set constructor doesn't take a key function, and set elements have to be hashable anyway.

So what's the closest way to achieve something like that, so that you can remove duplicates using arbitrary comparison functions?

Channel72 asked Oct 04 '12 15:10

2 Answers

You can use a dict instead of a set, with the field you want to deduplicate on as the dict key; duplicate keys collapse, so only one object per key survives:

d = {x.firstname: x for x in mylist}
mylist = list(d.values())
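This dict trick can be generalized into a small helper taking an arbitrary key function, which approximates the hypothetical `set(mylist, key=...)` from the question. `unique_by` is an illustrative name, not a stdlib function:

```python
from types import SimpleNamespace

def unique_by(items, key):
    # Later items overwrite earlier ones, so the *last* duplicate wins;
    # dicts preserve insertion order in Python 3.7+.
    return list({key(x): x for x in items}.values())

people = [SimpleNamespace(firstname="Ann", age=30),
          SimpleNamespace(firstname="Ann", age=40),
          SimpleNamespace(firstname="Bob", age=25)]
unique = unique_by(people, key=lambda p: p.firstname)
# Two objects remain; the second "Ann" (age 40) is the one kept.
```

If you need the *first* occurrence to win instead, see the loop-based answer below, which checks a seen-set before appending.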
interjay answered Oct 11 '22 13:10

I would do this:

seen = set()
newlist = []
for item in mylist:
    if item.firstname not in seen:
        newlist.append(item)
        seen.add(item.firstname)
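This keeps the first occurrence of each firstname. A sketch of the same idea wrapped into a reusable generator with a key function (the name `dedupe` is illustrative):

```python
def dedupe(items, key):
    # Yield each item whose key hasn't been seen yet, preserving order;
    # the first occurrence for each key wins.
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item

# usage: mylist = list(dedupe(mylist, key=lambda x: x.firstname))
```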
Tim Pietzcker answered Oct 11 '22 13:10