I am trying to extract rows from a Pandas DataFrame using a list of row names, but I can't get it to work. Here is an example:
# df
      alleles  chrom  pos  strand  assembly#  center  protLSID  assayLSID
rs#
TP3       A/C      0    3       +        NaN     NaN       NaN        NaN
TP7       A/T      0    7       +        NaN     NaN       NaN        NaN
TP12      T/A      0   12       +        NaN     NaN       NaN        NaN
TP15      C/A      0   15       +        NaN     NaN       NaN        NaN
TP18      C/T      0   18       +        NaN     NaN       NaN        NaN
test = ['TP3','TP12','TP18']
df.select(test)
This is what I was trying to do, using just the elements of the list, and I am getting this error: TypeError: 'Index' object is not callable. What am I doing wrong?
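You can select rows by a list of index values in a pandas DataFrame using either DataFrame.iloc[] (integer positions) or DataFrame.loc[] (labels), for example DataFrame.loc[df.index[[0, 2]]].
You can also filter with isin(): df[df.index.isin(labels)] keeps only the rows whose index label appears in the list.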
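As a rough sketch of the isin() approach, the snippet below rebuilds a cut-down version of the question's frame (only two of the columns, which is an assumption made for brevity) and filters it with a boolean mask. The label 'TP99' is hypothetical and is included only to show that labels missing from the index are skipped rather than raising an error.

```python
import pandas as pd

# Cut-down copy of the question's data (two columns only, for illustration).
df = pd.DataFrame(
    {"alleles": ["A/C", "A/T", "T/A", "C/A", "C/T"], "pos": [3, 7, 12, 15, 18]},
    index=["TP3", "TP7", "TP12", "TP15", "TP18"],
)

# 'TP99' is a hypothetical label that does not exist in the index.
wanted = ["TP3", "TP12", "TP18", "TP99"]

# Index.isin returns a boolean array: True where the label is in `wanted`.
mask = df.index.isin(wanted)

# Boolean indexing keeps the rows where the mask is True, in the frame's order.
subset = df[mask]
print(subset)
```

Note that the result follows the row order of the DataFrame, not the order of the list.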
iloc selects rows by integer position. So, to select the 5th row in a DataFrame, use df.iloc[[4]], since the first row is at position 0, the second row is at position 1, and so on.
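A small illustration of positional selection, using a one-column stand-in for the question's frame (the reduced column set is an assumption for brevity). It also shows the difference between passing a list of positions and a bare integer.

```python
import pandas as pd

# One-column stand-in for the question's data.
df = pd.DataFrame(
    {"pos": [3, 7, 12, 15, 18]},
    index=["TP3", "TP7", "TP12", "TP15", "TP18"],
)

# A list of positions returns a DataFrame, even for a single row.
fifth_row = df.iloc[[4]]

# A bare integer returns that row as a Series instead.
fifth_as_series = df.iloc[4]

# Several positions at once: the 1st, 3rd and 5th rows.
several = df.iloc[[0, 2, 4]]
print(several)
```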
To select a single value from the DataFrame, pass both a row label and a column label to loc, as in df.loc['TP3', 'alleles']. Inside the square brackets, the part before the comma selects rows and the part after it selects columns; a bare slice (:) selects everything along that axis, so df.loc[:, 'pos'] returns one whole column.
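The row/column comma convention can be sketched like this, again on a reduced copy of the question's data (three columns, chosen here only for illustration):

```python
import pandas as pd

# Reduced copy of the question's data.
df = pd.DataFrame(
    {
        "alleles": ["A/C", "A/T", "T/A", "C/A", "C/T"],
        "chrom": [0, 0, 0, 0, 0],
        "pos": [3, 7, 12, 15, 18],
    },
    index=["TP3", "TP7", "TP12", "TP15", "TP18"],
)

# Row label before the comma, column label after it: a single value.
single = df.loc["TP3", "alleles"]

# A bare slice selects every row; the result is one column as a Series.
column = df.loc[:, "pos"]

# Lists on both sides select a block of rows and columns together.
block = df.loc[["TP3", "TP18"], ["alleles", "pos"]]
print(block)
```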
You can use df.loc[['TP3','TP12','TP18']]
Here is a small example:
In [26]: df = pd.DataFrame({"a": [1,2,3], "b": [3,4,5], "c": [5,6,7]})
In [27]: df.index = ["x", "y", "z"]
In [28]: df
Out[28]:
   a  b  c
x  1  3  5
y  2  4  6
z  3  5  7
[3 rows x 3 columns]
In [29]: df.loc[["x", "y"]]
Out[29]:
   a  b  c
x  1  3  5
y  2  4  6
[2 rows x 3 columns]
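Applied to the data from the question, the same idea might look like the sketch below. The frame is rebuilt by hand with only some of the original columns, and 'TP99' is a hypothetical label added to show one way of guarding against names that are not in the index, since recent pandas versions raise a KeyError when loc is given a list containing a missing label.

```python
import pandas as pd

# Hand-built approximation of the question's frame (subset of columns).
df = pd.DataFrame(
    {
        "alleles": ["A/C", "A/T", "T/A", "C/A", "C/T"],
        "chrom": [0, 0, 0, 0, 0],
        "pos": [3, 7, 12, 15, 18],
        "strand": ["+", "+", "+", "+", "+"],
    },
    index=["TP3", "TP7", "TP12", "TP15", "TP18"],
)
df.index.name = "rs#"

test = ["TP3", "TP12", "TP18"]

# Label-based selection with a list of row names.
selected = df.loc[test]
print(selected)

# If some names may be absent, keep only the ones present in the index first.
maybe_missing = ["TP3", "TP99", "TP18"]  # 'TP99' is hypothetical
present = [label for label in maybe_missing if label in df.index]
safe = df.loc[present]
print(safe)
```

Unlike the isin() mask, loc returns the rows in the order given by the list.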