I have two Pandas DataFrames (A and B) with overlapping indexes. I want to drop the rows in DataFrame B for which that index value doesn't exist in DataFrame A.
I have looked into Pandas drop method for DataFrames but it drops the columns with the given labels, I would like to drop the columns for which there are not given labels.
For now I managed to do this with:
B.drop(B.drop(A.index).index)
But this is clearly not the best way to do it (not efficient or readable). Is there a better way?
Example:
DataFrame A:
index col1
1 some_data
2 some_data
3 some_data
4 some_data
DataFrame B:
index col2
1 other_data
2 other_data
3 other_data
4 other_data
5 other_data
6 other_data
I want to obtain DataFrame B':
index col2
1 other_data
2 other_data
3 other_data
4 other_data
I think you can use isin
with boolean indexing
:
print B.index.isin(A.index)
[ True True True True False False]
print B[B.index.isin(A.index)]
col2
index
1 other_data
2 other_data
3 other_data
4 other_data
You can use difference
to find the row labels that are not in the other df index:
In [6]:
df2.drop(df2.index.difference(df1.index))
Out[6]:
col2
index
1 other_data
2 other_data
3 other_data
4 other_data
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With