
How to select a list of rows by name in Pandas dataframe

I am trying to extract rows from a Pandas dataframe using a list of row names, but I can't get it to work. Here is an example:

# df
    alleles  chrom  pos strand  assembly#  center  protLSID  assayLSID  
rs#
TP3      A/C      0    3      +        NaN     NaN       NaN        NaN
TP7      A/T      0    7      +        NaN     NaN       NaN        NaN
TP12     T/A      0   12      +        NaN     NaN       NaN        NaN
TP15     C/A      0   15      +        NaN     NaN       NaN        NaN
TP18     C/T      0   18      +        NaN     NaN       NaN        NaN

test = ['TP3','TP12','TP18']

df.select(test)

This is what I tried, even with just one element of the list, and I get this error: TypeError: 'Index' object is not callable. What am I doing wrong?

upendra asked Dec 12 '15 04:12



People also ask

How do I select rows of pandas DataFrame based on a list?

You can select rows from a list of index values in a pandas DataFrame using either DataFrame.iloc[] (integer positions) or DataFrame.loc[df.index[...]] (positions translated to labels first).
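As a rough sketch of the two approaches, assuming a small made-up frame (the names `df` and `positions` and the values are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=["w", "x", "y", "z"])

positions = [0, 2]

# Select directly by integer position.
by_position = df.iloc[positions]

# Translate the positions to labels first, then select by label.
by_label = df.loc[df.index[positions]]

# Both routes pick the same rows here ("w" and "y").
print(by_position.equals(by_label))
```

When the row names are already known, as in the question, the labels can be passed to `.loc` directly without going through positions.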

How do you filter rows from a DataFrame based on a list?

Use pandas.DataFrame.isin() to filter a DataFrame using a list.
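In practice the filter is often written with `Index.isin` or `Series.isin` rather than `DataFrame.isin`, because those return a one-dimensional boolean mask. A minimal sketch, assuming a cut-down version of the question's data (the label `TP99` is made up to show what happens with a missing name):

```python
import pandas as pd

# Cut-down, illustrative version of the frame in the question.
df = pd.DataFrame(
    {"alleles": ["A/C", "A/T", "T/A"], "pos": [3, 7, 12]},
    index=["TP3", "TP7", "TP12"],
)

wanted = ["TP3", "TP12", "TP99"]  # TP99 is deliberately absent from the index

# Index.isin builds a boolean mask over the row labels;
# labels that are not in the index are simply ignored.
by_index = df[df.index.isin(wanted)]

# Series.isin does the same for the values of one column.
by_column = df[df["alleles"].isin(["A/T"])]

print(by_index)
print(by_column)
```

One design difference worth noting: a mask keeps the rows in the frame's original order, whereas `.loc` with a list returns them in the order of the list.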

How do I select specific rows in pandas based on index?

iloc selects rows based on their integer position. So, to select the 5th row in a DataFrame, you would use df.iloc[[4]], since the first row is at position 0, the second row is at position 1, and so on.

How do I select specific rows and columns from a DataFrame?

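To select rows and columns simultaneously, put both selectors inside the square brackets, separated by a comma: the part before the comma picks rows and the part after it picks columns, as in df.loc[rows, columns]. A single value is selected by giving one row label and one column label, and a slice (:) in the row position selects a whole column.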
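A short sketch of the comma syntax, reusing the small x/y/z frame from the answer further down (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [3, 4, 5], "c": [5, 6, 7]},
    index=["x", "y", "z"],
)

# One row label and one column label give a single value.
single_value = df.loc["y", "b"]

# A full slice in the row position selects every row of one column.
one_column = df.loc[:, "a"]

# Lists on both sides of the comma select a block of rows and columns.
block = df.loc[["x", "z"], ["a", "c"]]

print(single_value)
print(one_column)
print(block)
```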


1 Answer

You can use df.loc[['TP3','TP12','TP18']], or equivalently df.loc[test] with the list from your question.

Here is a small example:

In [25]: import pandas as pd

In [26]: df = pd.DataFrame({"a": [1,2,3], "b": [3,4,5], "c": [5,6,7]})

In [27]: df.index = ["x", "y", "z"]

In [28]: df
Out[28]: 
   a  b  c
x  1  3  5
y  2  4  6
z  3  5  7

[3 rows x 3 columns]

In [29]: df.loc[["x", "y"]]
Out[29]: 
   a  b  c
x  1  3  5
y  2  4  6

[2 rows x 3 columns]
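
Applied to the data in the question, the same idea might look like the sketch below. Only four of the original columns are reconstructed (the all-NaN ones are left out), and the label `TP99` is made up to show the failure case. In recent pandas versions `.loc` is expected to raise a KeyError when any listed label is missing; older releases behaved differently, so treat that part as an assumption about your installed version.

```python
import pandas as pd

# Partial reconstruction of the question's frame, for illustration only.
df = pd.DataFrame(
    {
        "alleles": ["A/C", "A/T", "T/A", "C/A", "C/T"],
        "chrom": [0, 0, 0, 0, 0],
        "pos": [3, 7, 12, 15, 18],
        "strand": ["+"] * 5,
    },
    index=pd.Index(["TP3", "TP7", "TP12", "TP15", "TP18"], name="rs#"),
)

test = ["TP3", "TP12", "TP18"]

# Label-based selection with a list of row names.
subset = df.loc[test]
print(subset)

# A label that is not in the index (TP99 is hypothetical).
try:
    df.loc[["TP3", "TP99"]]
except KeyError as exc:
    print("missing label:", exc)
```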
Akavall answered Sep 30 '22 01:09
