I have a DataFrame like the following:
import numpy as np
import pandas as pd
import string
import random
random.seed(42)
df = pd.DataFrame({'col1': list(string.ascii_lowercase)[:11],
                   'col2': [random.randint(1, 100) for x in range(11)]})
df
col1 col2
0 a 64
1 b 3
2 c 28
3 d 23
4 e 74
5 f 68
6 g 90
7 h 9
8 i 43
9 j 3
10 k 22
I'm trying to create a new DataFrame by keeping only the rows of the previous DataFrame whose col1 value matches one of a list of values. I tried the following code:
df_filt = df[df['col1'] in ['a','c','h']]
But I get an error. I expect the following result:
df_filt
col1 col2
0 a 64
1 c 28
2 h 9
I'm looking for a flexible solution that also works when the list of matching values has more elements than in this example.
To filter one column by multiple values, df.loc[df['channel'].apply(lambda x: x in ['sale', 'fullprice'])] would also work.
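A quick sketch comparing that apply-based approach with isin. Only the 'channel' column name and the two wanted values come from the comment above; the rest of the frame is made up for illustration.

```python
import pandas as pd

# Hypothetical data: only the 'channel' column name and the two wanted
# values come from the comment; everything else is invented.
df = pd.DataFrame({'channel': ['sale', 'clearance', 'fullprice', 'sale'],
                   'amount': [10, 20, 30, 40]})
wanted = ['sale', 'fullprice']

# Membership test evaluated element by element through a Python lambda
by_apply = df.loc[df['channel'].apply(lambda x: x in wanted)]

# Same membership test using the built-in Series.isin
by_isin = df.loc[df['channel'].isin(wanted)]

# Both selections should contain the same rows
print(by_apply.equals(by_isin))
```

Both forms select the same rows here; isin is usually the shorter and more idiomatic choice.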
You can use pandas.Series.isin for compound "in"-checks.
Input dataframe:
>>> df
col1 col2
0 a 64
1 b 3
2 c 28
3 d 23
4 e 74
5 f 68
6 g 90
7 h 9
8 i 43
9 j 3
10 k 22
Output dataframe:
>>> df[df['col1'].isin(['a', 'c', 'h'])]
col1 col2
0 a 64
2 c 28
7 h 9
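Note that the output above keeps the original row labels 0, 2, 7, while the question expected 0, 1, 2. A minimal sketch of one way to get that, assuming a fresh index is really what is wanted, is to chain reset_index(drop=True). The col2 values are typed in from the question rather than regenerated, since the random output may differ between Python versions.

```python
import string
import pandas as pd

# col2 values copied from the question's printed DataFrame
df = pd.DataFrame({'col1': list(string.ascii_lowercase)[:11],
                   'col2': [64, 3, 28, 23, 74, 68, 90, 9, 43, 3, 22]})

# Keep rows whose col1 is in the list, then renumber the rows from 0;
# drop=True discards the old labels instead of adding them as a column
df_filt = df[df['col1'].isin(['a', 'c', 'h'])].reset_index(drop=True)
print(df_filt)
```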
Use isin:
df_filt = df[df.col1.isin(['a','c','h'])]
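For the flexibility asked about in the question, the list can live in a variable of any length. The sketch below also illustrates why the original attempt fails: Python's `in` compares the whole Series against each list element and then needs a single True/False, which pandas refuses to give for a Series. The names `values` and `raised` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': list('abcdefghijk')})

# The list of wanted values can be any length
values = ['a', 'c', 'h', 'k']

# The attempt from the question: `in` ends up asking for the truth
# value of a whole boolean Series, which raises ValueError
try:
    df[df['col1'] in values]
    raised = False
except ValueError:
    raised = True

# isin builds an element-wise boolean mask instead
df_filt = df[df['col1'].isin(values)]
print(raised)
print(df_filt)
```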