I want code that deletes all instances of any number that is repeated in a list.
E.g.:
Inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]
Outputlist = [2,3,8,9,14]
I have already tried to remove the duplicated elements from the list (by using the "unique" function), but it nevertheless leaves a single instance of each element in the list!
seen = set()
uniq = []
for x in Outputlist:
    if x not in seen:
        uniq.append(x)
        seen.add(x)
seen
I went through a lot of Stack Overflow posts too, but all of them differ from what I need: they either remove the elements common to two different lists, or they keep one instance of each duplicated element. I simply want to remove every element that occurs more than once.
You can use a Counter:
>>> from collections import Counter
>>> l = [2, 3, 6, 6, 8, 9, 12, 12, 14]
>>> res = [el for el, cnt in Counter(l).items() if cnt==1]
>>> res
[2, 3, 8, 9, 14]
You can always use two sets: one to check whether an element has been seen, and another to keep only the unique ones. set.discard(el) removes el if it exists, without raising an error.
Inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]
seen = set()
ans = set()
for el in Inputlist:
    if el not in seen:
        seen.add(el)
        ans.add(el)
    else:
        ans.discard(el)
print(list(ans))
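Note that sets are unordered, so list(ans) is not guaranteed to preserve the input order. A sketch of an order-preserving variant (two passes: first collect the duplicated values, then filter the original list):

```python
Inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]

seen = set()  # every value encountered so far
dups = set()  # values encountered more than once
for el in Inputlist:
    if el in seen:
        dups.add(el)
    seen.add(el)

# Keep only values that never repeated, in their original order
Outputlist = [el for el in Inputlist if el not in dups]
print(Outputlist)  # [2, 3, 8, 9, 14]
```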
EDIT: for giggles I measured the performance of these two solutions
from timeit import timeit
first = """
def get_from_two_sets():
    seen = set()
    ans = set()
    for el in (2, 3, 6, 6, 8, 9, 12, 12, 14):
        if el not in seen:
            seen.add(el)
            ans.add(el)
        else:
            ans.discard(el)"""
second = """
def get_from_counter():
    return [el for el, cnt in Counter((2, 3, 6, 6, 8, 9, 12, 12, 14)).items() if cnt == 1]
"""
print(timeit(stmt=first, number=10000000))
print(timeit(stmt=second, number=10000000, setup="from collections import Counter"))
yields
0.3130729760000577
0.46127468299982866
so yay! it seems like my solution is slightly faster. Don't waste those nanoseconds you saved!
@abc's solution is clean and pythonic, go for it.
A simple list comprehension will do the trick:
Inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]
Outputlist = [item for item in Inputlist if Inputlist.count(item) == 1]
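Keep in mind that Inputlist.count(item) rescans the whole list for every element, making this O(n²). For longer lists, a sketch that precomputes the counts once with collections.Counter keeps the same comprehension style but runs in O(n):

```python
from collections import Counter

Inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]
counts = Counter(Inputlist)  # one pass to count every value
Outputlist = [item for item in Inputlist if counts[item] == 1]
print(Outputlist)  # [2, 3, 8, 9, 14]
```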
Alternate solution for the case where only consecutive duplicates should be removed:
from itertools import groupby
inputlist = [2, 3, 6, 6, 8, 9, 12, 12, 14]
outputlist = [x for _, (x, *extra) in groupby(inputlist) if not extra]
All this does is group together runs of identical values, unpack the first copy into x, and collect the rest into a list named extra; we check whether extra is empty to determine whether there was just one value or more than one, and keep only the groups with a single value.
If you don't like even the temporary extra list, using one of the ilen solutions (for example, more_itertools.ilen) that doesn't listify the group allows a similar solution with no unbounded temporary storage:
from more_itertools import ilen
outputlist = [x for x, grp in groupby(inputlist) if ilen(grp) == 1]
Or with a helper that just checks "at least 2" without iterating beyond that point:
def more_than_one(it):
    next(it)  # Assumes at least one element, which is already the case with groupby groups
    try:
        next(it)
    except StopIteration:
        return False
    return True
outputlist = [x for x, grp in groupby(inputlist) if not more_than_one(grp)]
Note: I'd actually prefer abc's Counter-based solution in general, but if you actually want to delete only adjacent duplicates, it's not adequate to the task.
Another solution using sets: convert the input list to a set and remove each element of this set from the input list once (list.remove deletes only the first occurrence). This leaves only the duplicates in the list. Now convert this to a set and subtract one set from the other. Sounds complicated, but it is quite short and efficient for short lists:
l = [2, 3, 6, 6, 8, 9, 12, 12, 14]
inset = set(l)
for i in inset: # <-- usually the element to remove is in the front,
l.remove(i) # <-- but in a worst case, this is slower than O(n)
result = list(inset - set(l))
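To see why this works, a sketch tracing the intermediate state with the same example list:

```python
l = [2, 3, 6, 6, 8, 9, 12, 12, 14]
inset = set(l)                 # {2, 3, 6, 8, 9, 12, 14}
for i in inset:
    l.remove(i)                # removes only the FIRST occurrence of i
print(l)                       # [6, 12] -- only the extra copies remain
result = list(inset - set(l))  # distinct values minus the duplicated ones
print(sorted(result))          # [2, 3, 8, 9, 14]
```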
Performance is irrelevant for the short example list:
# %timeit this solution
1.18 µs ± 1.97 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
# %timeit solution with seen-set
1.23 µs ± 1.49 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
# %timeit solution with Counter class
2.76 µs ± 4.85 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
For a list with 1000 elements and 10% duplicates the Counter-solution is fastest!
If the input is sorted and can be bounded by a min and a max, this can be done in O(n) by comparing each element against its neighbors (lo and hi are renamed from min and max to avoid shadowing the built-ins):
I = [2, 3, 6, 6, 8, 9, 12, 12, 14]
lo = -1        # any value smaller than the minimum
hi = 99999999  # put whatever you need
J = [lo] + I + [hi]
result = [y for (x, y, z) in zip(J, J[1:], J[2:]) if x < y and y < z]