Let's assume I have a list, structured like this with approx 1 million elements:
a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]
What is the fastest way to remove all elements from a whose value at index 0 has already appeared earlier in the list?
The result should be
b = [["a","a"],["b","a"],["c","a"],["d","a"]]
Is there a faster way than this:
processed = []
no_duplicates = []
for elem in a:
    if elem[0] not in processed:
        no_duplicates.append(elem)
        processed.append(elem[0])
This works but the appending operations take ages.
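You can use a set to record the first elements you have already seen, and check each sublist's first element against it. A membership test on a set takes O(1) time on average, compared with O(n) for searching the processed list in your solution.
The appends are not the bottleneck (each one is amortized O(1)); the slow part is the "not in processed" check, which scans the whole list every time.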
>>> a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]
>>>
>>> seen = set()
>>> new_a = []
>>> for i in a:
...     if i[0] not in seen:
...         new_a.append(i)
...         seen.add(i[0])
...
>>> new_a
[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
>>>
Space complexity: O(N). Time complexity: O(N). Checking whether a first element has already been seen: O(1) on average.
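As an alternative sketch (not part of the answer above), a dict can do the same bookkeeping in one step. This assumes Python 3.7+, where dicts preserve insertion order; the function name dedupe_by_first is made up for illustration.

```python
def dedupe_by_first(rows):
    """Keep the first sublist seen for each value at index 0."""
    first_seen = {}
    for row in rows:
        # setdefault stores row only if row[0] is not already a key,
        # so later duplicates are ignored.
        first_seen.setdefault(row[0], row)
    # Insertion order is preserved (Python 3.7+), so the first
    # occurrences come back in their original order.
    return list(first_seen.values())


a = [["a", "a"], ["b", "a"], ["c", "a"], ["d", "a"], ["a", "a"], ["a", "a"]]
b = dedupe_by_first(a)
```

This is still O(N) time and O(N) extra space. Whether it beats the explicit set loop on a million elements is something to measure with timeit on your own data rather than assume.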
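If you must not create a new list, you can delete the duplicate elements in place with del, but this increases the time complexity, because each deletion from the middle of a list shifts all later elements.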
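If in-place modification really is required, one way to avoid paying for a separate del per duplicate is to compact the list with a write index and trim the tail with a single del at the end. This is only a sketch under that assumption, not the approach the answer describes; dedupe_in_place is a hypothetical name, and it still needs an O(N) set for the seen keys.

```python
def dedupe_in_place(rows):
    """Remove sublists whose value at index 0 was seen earlier, modifying rows."""
    seen = set()
    write = 0  # next position to store a kept element
    for row in rows:
        key = row[0]
        if key not in seen:
            seen.add(key)
            # write never runs ahead of the read position,
            # so this only overwrites slots already visited.
            rows[write] = row
            write += 1
    # One slice deletion drops the leftover tail.
    del rows[write:]


a = [["a", "a"], ["b", "a"], ["c", "a"], ["d", "a"], ["a", "a"], ["a", "a"]]
dedupe_in_place(a)
```

After the call, a holds only the first occurrence of each key, in the original order, and the whole pass stays O(N).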