Filtering a pandas df with any of the list values [duplicate]

Question

I have a pandas dataframe:

df
0       PL
1       PL
2       PL
3       IT
4       IT
        ..
4670    DE
4671    NO
4672    MT
4673    FI
4674    XX
Name: country_code, Length: 4675, dtype: object

I am filtering this by the Germany country code 'DE' via:

df = df[df.apply(lambda x: 'DE' in x)]

If I want to filter by more countries, then I have to add them manually via: .apply(lambda x: 'DE' in x or 'GB' in x). However, I would like to create a list of countries and generate this statement automatically.

Something like this:

countries = ['DE', 'GB', 'IT']
df = df[df.apply(lambda x: any_item_in_countries_list in x)]

I think I could filter df three times and then merge the pieces back together via concat(). However, is there a more generic function to achieve this?

Andreas · Accepted Answer

You can use .isin():

df[df['country_code'].isin(['DE', 'GB', 'IT'])]
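The output shown in the question looks like a Series named country_code rather than a DataFrame. In that case .isin() can be called on the Series directly. A minimal sketch, assuming made-up sample data (the names codes, frame and countries are only for illustration):

```python
import pandas as pd

# Hypothetical sample data mirroring the shape of the question's output
codes = pd.Series(["PL", "IT", "DE", "NO", "GB", "XX"], name="country_code")
countries = ["DE", "GB", "IT"]

# Series case: isin() returns a boolean mask, which is then used for indexing
filtered_series = codes[codes.isin(countries)]

# DataFrame case: the same mask, built from the column
frame = codes.to_frame()
filtered_frame = frame[frame["country_code"].isin(countries)]

print(filtered_series.tolist())  # ['IT', 'DE', 'GB']
```

Both results keep the original index labels, so the rows can still be traced back to the unfiltered data.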

Performance comparison:

import timeit
import pandas as pd
df = pd.DataFrame({'country_code': ['DE', 'GB', 'IT', 'MT', 'FI', 'XX'] * 1000})

%timeit df[df['country_code'].isin(['DE', 'GB', 'IT'])]
409 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit df['country_code'].apply(lambda x: x in ['DE', 'AT', 'GB'])
1.35 ms ± 474 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
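%timeit is an IPython magic and only works in IPython or Jupyter. Outside of those, a roughly equivalent comparison can be sketched with the standard timeit module. The helper names with_isin and with_apply are made up for this example, both variants use the same list and apply the mask, and the timings will differ from machine to machine:

```python
import timeit

import pandas as pd

df = pd.DataFrame({"country_code": ["DE", "GB", "IT", "MT", "FI", "XX"] * 1000})
countries = ["DE", "GB", "IT"]


def with_isin():
    # Vectorized membership test
    return df[df["country_code"].isin(countries)]


def with_apply():
    # Python-level membership test, called once per row
    return df[df["country_code"].apply(lambda x: x in countries)]


# Sanity check: both approaches must select the same rows
assert with_isin().equals(with_apply())

n = 100  # number of repetitions per measurement
t_isin = timeit.timeit(with_isin, number=n) / n
t_apply = timeit.timeit(with_apply, number=n) / n
print(f"isin:  {t_isin * 1e6:.0f} µs per loop")
print(f"apply: {t_apply * 1e6:.0f} µs per loop")
```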

Sabil · Answer

If your data has column names, then you can try this:

countries = ['DE', 'GB', 'IT']
df[df['country_code'].isin(countries)]
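One difference worth knowing: the lambda in the question uses 'DE' in x, which on a string is a substring test, while .isin() matches whole values only. If substring semantics are really wanted, the lambda can be generalized with any(). A sketch under that assumption, with a hypothetical value 'DE-BY' added to show where the two approaches disagree:

```python
import pandas as pd

# Hypothetical data; 'DE-BY' is invented to illustrate the difference
codes = pd.Series(["PL", "IT", "DE-BY", "NO", "GB", "XX"], name="country_code")
countries = ["DE", "GB", "IT"]

# Generalizes `'DE' in x or 'GB' in x`: True if any listed code occurs inside x
substring_mask = codes.apply(lambda x: any(c in x for c in countries))

# Exact match against the list, as in the answers above
exact_mask = codes.isin(countries)

print(codes[substring_mask].tolist())  # ['IT', 'DE-BY', 'GB']
print(codes[exact_mask].tolist())      # ['IT', 'GB']
```

For plain two-letter codes the two masks give the same rows, and .isin() is the simpler choice.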

Tags:

python

pandas

lambda

filter

oakca
