I have three lists X, Y, Z as follows:
X: [1, 1, 2, 3, 4, 5, 5, 5]
Y: [3, 3, 2, 6, 7, 1, 1, 2]
Z: [0, 0, 1, 1, 2, 3, 3, 4]
I am trying to remove every set of values that is duplicated at the same index across the lists, dropping both copies rather than keeping one, to get reduced lists as follows. All three lists always have the same length, both initially and at the end:
X: [2, 3, 4, 5]
Y: [2, 6, 7, 2]
Z: [1, 1, 2, 4]
I tried using zip(X, Y, Z), but I can't index the result, and dict.fromkeys only removes one of the duplicates and leaves the other in the new list. I want to remove both.
Any help is appreciated!
Using collections.Counter and zip, you can count unique triplets. Then remove duplicates via a generator comprehension.
from collections import Counter
X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]
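# Count how often each (x, y, z) triplet occurs
c = Counter(zip(X, Y, Z))
# Keep only triplets seen exactly once, then transpose back into three tuples
X, Y, Z = zip(*(k for k, v in c.items() if v == 1))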
print(X, Y, Z, sep='\n')
(2, 3, 4, 5)
(2, 6, 7, 2)
(1, 1, 2, 4)
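One caveat with the one-liner above: if every triplet is duplicated, zip(*...) receives nothing and the three-way unpacking raises a ValueError. Below is a minimal sketch of a helper that handles that case and returns lists instead of tuples. The name drop_repeated_triplets is hypothetical, not part of any library, and the sketch assumes the values are hashable.

```python
from collections import Counter

def drop_repeated_triplets(*columns):
    """Keep only index positions whose tuple of values occurs exactly once.

    Hypothetical helper for illustration; accepts any number of
    equal-length lists and returns one reduced list per input list.
    """
    rows = list(zip(*columns))
    counts = Counter(rows)
    # Walk the rows in their original order so the result stays ordered
    kept = [row for row in rows if counts[row] == 1]
    if not kept:
        # Nothing survived: return one empty list per input column
        return tuple([] for _ in columns)
    return tuple(list(col) for col in zip(*kept))

X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

X2, Y2, Z2 = drop_repeated_triplets(X, Y, Z)
print(X2, Y2, Z2, sep='\n')
```

Because the helper iterates over the original rows rather than over the Counter's keys, the output order does not depend on dict ordering.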
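Note: if ordering is a requirement and you are not using Python 3.7+ (where dict insertion order is guaranteed; CPython 3.6 preserves it only as an implementation detail), you can create an "OrderedCounter" instead by subclassing collections.OrderedDict.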
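As a rough sketch of the "OrderedCounter" idea mentioned in the note, the class can inherit from both Counter and OrderedDict. The class name is just a convention, not something the standard library exports.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""
    pass

X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

c = OrderedCounter(zip(X, Y, Z))
# Triplets that occur exactly once, in first-seen order
unique = [k for k, v in c.items() if v == 1]
print(unique)
```

On older interpreters this keeps the surviving triplets in their original order; on Python 3.7+ a plain Counter already behaves the same way.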
The pandas library is convenient for this task. Create a DataFrame from the lists and call df.drop_duplicates with keep=False (which removes every row that has a duplicate):
import pandas as pd
dct = {
"X": [1, 1, 2, 3, 4, 5, 5, 5],
"Y": [3, 3, 2, 6, 7, 1, 1, 2],
"Z": [0, 0, 1, 1, 2, 3, 3, 4],
}
d = pd.DataFrame(dct)
# keep=False drops every copy of a duplicated row; the result is a new DataFrame
d = d.drop_duplicates(keep=False)
print(d)