I have a list:
df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'],
['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]
From this list I want only the first 2 items of the first sublist for each distinct value in the first position (that is, the first 'apple' row and the first 'guava' row).
Expected output:
df = [['apple', 'red'], ['guava', 'green']]
Code till now:
dummy_list = []
for item in df:
    if item[0] not in dummy_list:
        dummy_list.append(item[:2])
This is not working: it appends every element. Any help would be appreciated.
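For reference, the loop above appends every row because dummy_list ends up holding two-item lists such as ['apple', 'red'], while the test compares the plain string item[0] against those lists, so the membership check never matches. A minimal sketch of the mismatch, using a shortened copy of the data (the names rows and seen_pairs are only for illustration):

```python
rows = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['guava', 'green', '1.9']]

seen_pairs = [rows[0][:2]]            # what the original loop stores: [['apple', 'red']]

# A string is never equal to a list, so this check is False for every row.
string_found = 'apple' in seen_pairs

# Comparing against the first element of each stored pair does find it.
key_found = 'apple' in [pair[0] for pair in seen_pairs]

print(string_found, key_found)        # False True
```

So the fix is to test item[0] against a collection of keys rather than against the collection of stored pairs.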
Or, smarter: use a dict and setdefault, which adds the mapping only for the first row seen with each key:
result = {}
for value in df:
    result.setdefault(value[0], value[:2])
result = list(result.values())
print(result)
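A quick sketch of why setdefault fits here: it stores the value only when the key is missing and otherwise leaves the existing entry alone. The result order also relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on. The name first_seen is just for illustration:

```python
first_seen = {}

# Key is missing, so the pair is stored and returned.
a = first_seen.setdefault('apple', ['apple', 'red'])

# Key already exists, so the stored pair is returned and the new one is ignored.
b = first_seen.setdefault('apple', ['apple', 'green'])

print(a, b)                        # ['apple', 'red'] ['apple', 'red']
print(list(first_seen.values()))   # [['apple', 'red']]
```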
Or you could keep track of the keys already added, in a separate set, to avoid repeating them:
keys = set()
result = []
for value in df:
    if value[0] not in keys:
        result.append(value[:2])
        keys.add(value[0])
print(result)  # [['apple', 'red'], ['guava', 'green']]
You can use itertools.groupby, with operator.itemgetter for the key:
from itertools import groupby
from operator import itemgetter
df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'],
['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]
df1 = [next(g)[:2] for k, g in groupby(df, key=itemgetter(0))]
FYI, itemgetter(0) is the same as lambda x: x[0], so you could use that too.
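One caveat worth checking against your data: groupby only merges rows that sit next to each other, so it gives the expected result here because all the apple rows and all the guava rows are adjacent. A small sketch with a made-up, interleaved list (mixed is hypothetical, not from the question) shows the difference and one possible way around it:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where equal keys are not adjacent.
mixed = [['apple', 'red', '0.2'], ['guava', 'green', '1.9'], ['apple', 'green', '8.9']]

# groupby starts a new group every time the key changes, so 'apple' shows up twice.
grouped = [next(g)[:2] for _, g in groupby(mixed, key=itemgetter(0))]
print(grouped)   # [['apple', 'red'], ['guava', 'green'], ['apple', 'green']]

# Sorting by the key first makes equal keys adjacent; sorted() is stable,
# so the first row of each key is still the first one from the original order.
by_key = sorted(mixed, key=itemgetter(0))
fixed = [next(g)[:2] for _, g in groupby(by_key, key=itemgetter(0))]
print(fixed)     # [['apple', 'red'], ['guava', 'green']]
```

Note that after sorting, the groups come out in key order rather than in order of first appearance; if that matters, the dict or set approaches keep the original order.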