
Select subset of Data Frame rows based on a list in Pandas

Tags: python, pandas

I have a data frame df1 and list x:

In [22]: import pandas as pd
In [23]: df1 = pd.DataFrame({"A": list("abcde"), "B": range(10, 20, 2), "C": range(5)})
In [24]: df1
Out[24]:
   A   B  C
0  a  10  0
1  b  12  1
2  c  14  2
3  d  16  3
4  e  18  4

In [25]: x = ["b","c","g","h","j"]

I want to select the rows of the data frame whose value in column A appears in the list, returning:

   A   B  C
1  b  12  1
2  c  14  2

How can I do this? I tried the following, but it did not give the result I wanted:

df1.join(pd.DataFrame(x),how="inner")
pdubois asked Jan 29 '15 09:01




1 Answer

Use isin to build a boolean mask, then use that mask to index into your df:

In [152]: df1[df1['A'].isin(x)]
Out[152]:
   A   B  C
1  b  12  1
2  c  14  2

This is what isin returns on its own:

In [153]: df1['A'].isin(x)
Out[153]:
0    False
1     True
2     True
3    False
4    False
Name: A, dtype: bool
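
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same approach. The variable names mask, selected, excluded and merged are just illustrative, not part of the question. It also shows two things that may be useful: negating the mask with ~ to get the rows that are not in the list, and a merge-based version of the inner join attempted in the question. As far as I can tell, the original join did not filter anything because DataFrame.join aligns on the index by default rather than on the values in column A.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({"A": list("abcde"), "B": range(10, 20, 2), "C": range(5)})
x = ["b", "c", "g", "h", "j"]

# Boolean Series: True where the value in column A appears in x.
mask = df1["A"].isin(x)

# Rows whose A value is in the list (original index labels are kept).
selected = df1[mask]

# Rows whose A value is NOT in the list, by negating the mask.
excluded = df1[~mask]

# Alternative sketch: an inner merge on column A against a one-column frame
# built from the list. Note that merge returns a fresh 0..n-1 index.
merged = df1.merge(pd.DataFrame({"A": x}), on="A", how="inner")

print(selected)
print(excluded)
print(merged)
```

The isin version keeps the original row labels (1 and 2), whereas the merge version renumbers the rows, so isin is usually the simpler choice when you only need to filter.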
EdChum answered Oct 05 '22 18:10
