I have a pandas
DataFrame
df
:
import pandas as pd data = {"Name": ["AAAA", "BBBB"], "C1": [25, 12], "C2": [2, 1], "C3": [1, 10]} df = pd.DataFrame(data) df.set_index("Name")
which looks like this when printed (for reference):
C1 C2 C3 Name AAAA 25 2 1 BBBB 12 1 10
I would like to choose rows for which C1
, C2
and C3
have values between 0
and 20
.
Can you suggest an elegant way to select those rows?
You can select the Rows from Pandas DataFrame based on column values or based on multiple conditions either using DataFrame. loc[] attribute, DataFrame. query() or DataFrame. apply() method to use lambda function.
You can get pandas. Series of bool which is an AND of two conditions using & . Note that == and ~ are used here as the second condition for the sake of explanation, but you can use !=
To filter pandas DataFrame by multiple columns. When we filter a DataFrame by one column, we simply compare that column values against a specific condition but when it comes to filtering of DataFrame by multiple columns, we need to use the AND (&&) Operator to match multiple columns with multiple conditions.
I think below should do it, but its elegance is up for debate.
new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]
Shorter version:
In [65]: df[(df>=0)&(df<=20)].dropna() Out[65]: Name C1 C2 C3 1 BBBB 12 1 10
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With