I would like to remove a certain number of duplicates from a list without removing all of them. For example, I have a list [1,2,3,4,4,4,4,4] and I want to remove 3 of the 4's, so that I am left with [1,2,3,4,4]. A naive way to do it would probably be:
def remove_n_duplicates(remove_from, what, how_many):
    for j in range(how_many):
        remove_from.remove(what)
Is there a way to remove three of the 4's in one pass through the list, but keep the other two?
If you just want to remove the first n occurrences of something from a list, this is pretty easy to do with a generator:
def remove_n_dupes(remove_from, what, how_many):
    count = 0
    for item in remove_from:
        if item == what and count < how_many:
            count += 1
        else:
            yield item
Usage looks like:
lst = [1,2,3,4,4,4,4,4]
print(list(remove_n_dupes(lst, 4, 3)))  # [1, 2, 3, 4, 4]
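Note that the generator produces a new sequence and leaves the original list untouched, whereas the naive version in the question mutates the list. If in-place behaviour is wanted, one possible sketch is to write the filtered result back with slice assignment. The helper name remove_n_dupes_in_place is made up for this illustration, and the generator is repeated only so the snippet runs on its own:

```python
def remove_n_dupes(remove_from, what, how_many):
    """Yield the items of remove_from, skipping the first how_many equal to what."""
    count = 0
    for item in remove_from:
        if item == what and count < how_many:
            count += 1
        else:
            yield item


def remove_n_dupes_in_place(lst, what, how_many):
    # Hypothetical helper (not from the original answer): build the filtered
    # list first, then overwrite the contents of the same list object.
    lst[:] = list(remove_n_dupes(lst, what, how_many))


lst = [1, 2, 3, 4, 4, 4, 4, 4]
alias = lst                      # second reference to the same list object
remove_n_dupes_in_place(lst, 4, 3)
print(lst)                       # [1, 2, 3, 4, 4]
print(alias is lst)              # True
```

Unlike repeated calls to list.remove, this does not raise ValueError when the list holds fewer than how_many matches; it simply removes the ones that are there.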
Keeping a specified number of duplicates of any item is similarly easy if we use a little extra auxiliary storage:
from collections import Counter

def keep_n_dupes(remove_from, how_many):
    counts = Counter()
    for item in remove_from:
        counts[item] += 1
        if counts[item] <= how_many:
            yield item
Usage is similar:
lst = [1,1,1,1,2,3,4,4,4,4,4]
print(list(keep_n_dupes(lst, 2)))  # [1, 1, 2, 3, 4, 4]
Here the input is the list and the maximum number of copies of each item that you want to keep. The caveat is that the items need to be hashable, since Counter uses them as dictionary keys.
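If the items are not hashable (lists, for example), one possible fallback is to track the counts in a plain list and compare by equality. This is only a sketch, not part of the original answer: the name keep_n_dupes_unhashable is invented here, and the lookup is linear, so the whole pass is roughly O(n * k) for k distinct items rather than O(n).

```python
def keep_n_dupes_unhashable(remove_from, how_many):
    # Hypothetical fallback: store [item, count] pairs in a list and find
    # each item by equality, since unhashable items cannot be dict keys.
    seen = []
    for item in remove_from:
        for pair in seen:
            if pair[0] == item:
                pair[1] += 1
                count = pair[1]
                break
        else:
            # No equal item seen so far: start a new counter for it.
            seen.append([item, 1])
            count = 1
        if count <= how_many:
            yield item


lst = [[1], [1], [1], [2], [1], [2], [2]]
print(list(keep_n_dupes_unhashable(lst, 2)))  # [[1], [1], [2], [2]]
```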