I have a pandas DataFrame and want to select the rows where certain columns have specific values. For example, for one column I tried this:
import pandas as pd

df = pd.DataFrame({
    'subA': [54, 98, 70, 91, 38],
    'subB': [25, 26, 30, 93, 30],
    'subC': [43, 89, 56, 50, 48]})
a = df[df['subA'] == 70]
print(a)
The output was as follows:
   subA  subB  subC
2    70    30    56
This is expected and totally understandable. Now I want to select the rows where the first two columns have specific values. For example, I changed the code as follows:
import pandas as pd

df = pd.DataFrame({
    'subA': [54, 98, 70, 91, 38],
    'subB': [25, 26, 30, 93, 30],
    'subC': [43, 89, 56, 50, 48]})
my_sub = ['subA', 'subB']
my_marks = [54, 25]
a = df[df[my_sub] == my_marks]
print(a)
I was expecting to see results like this:
   subA  subB  subC
0    54    25    43
But instead the output is full of NaN values, which is not clear to me:
   subA  subB  subC
0  54.0  25.0   NaN
1   NaN   NaN   NaN
2   NaN   NaN   NaN
3   NaN   NaN   NaN
4   NaN   NaN   NaN
What am I missing here to get the desired output? I also tried .loc and .iloc, but those did not help.
You can use all(axis=1) to reduce the comparison to a single boolean per row, which makes boolean indexing select whole rows:
df[(df[my_sub] == my_marks).all(axis=1)]
   subA  subB  subC
0    54    25    43
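To see why the original attempt produced NaN values, it can help to look at the intermediate masks. The sketch below reuses the data from the question; the names mask_2d, masked, row_mask and result are only illustrative. Indexing with a boolean DataFrame keeps the frame's shape and blanks out the cells that do not match, whereas indexing with a boolean Series (one value per row) selects whole rows.

```python
import pandas as pd

df = pd.DataFrame({
    'subA': [54, 98, 70, 91, 38],
    'subB': [25, 26, 30, 93, 30],
    'subC': [43, 89, 56, 50, 48]})
my_sub = ['subA', 'subB']
my_marks = [54, 25]

# Element-wise comparison: a boolean DataFrame with one column per entry in my_sub
mask_2d = df[my_sub] == my_marks

# Indexing with a boolean DataFrame masks cells instead of selecting rows,
# so every non-matching cell (and all of subC) becomes NaN
masked = df[mask_2d]

# Collapse the mask to one boolean per row: True only if every column matched
row_mask = mask_2d.all(axis=1)

# Indexing with a boolean Series selects complete rows
result = df[row_mask]

print(mask_2d)
print(masked)
print(row_mask)
print(result)
```

The key difference is the shape of the mask: two-dimensional in the question, one-dimensional after all(axis=1).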
Or using eq and all, as @ansev said:
df[df[my_sub].eq(my_marks).all(axis=1)]
   subA  subB  subC
0    54    25    43
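For only a couple of columns, another option is to extend the single-column approach from the question and combine two conditions with &. This is a minimal sketch on the same data, not part of the answer above; each comparison needs its own parentheses because & binds more tightly than ==.

```python
import pandas as pd

df = pd.DataFrame({
    'subA': [54, 98, 70, 91, 38],
    'subB': [25, 26, 30, 93, 30],
    'subC': [43, 89, 56, 50, 48]})

# Each comparison yields a boolean Series; & combines them row by row
a = df[(df['subA'] == 54) & (df['subB'] == 25)]
print(a)
```

The all(axis=1) form scales better when the list of columns is built dynamically, while the explicit & form can be easier to read for two or three fixed conditions.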