Forcing pandas .iloc to return a single-row dataframe?

For programming purposes, I want .iloc to consistently return a DataFrame, even when the result has only one row. How can I accomplish this?

Currently, .iloc returns a Series when the result only has one row. Example:

    In [1]: df = pd.DataFrame({'a':[1,2], 'b':[3,4]})

    In [2]: df
    Out[2]:
       a  b
    0  1  3
    1  2  4

    In [3]: type(df.iloc[0, :])
    Out[3]: pandas.core.series.Series

This behavior is a problem for two reasons:

  • Depending on the number of selected rows, .iloc returns either a Series or a DataFrame, forcing me to check for this manually in my code.

  • .loc, on the other hand, always returns a DataFrame, making pandas inconsistent with itself. (This is wrong, as pointed out in the comments.)

For R users, this can be accomplished with drop = FALSE, or by using tidyverse's tibble, which always returns a data frame by default.

Heisenberg asked Aug 31 '17 20:08



2 Answers

Use double brackets:

df.iloc[[0]] 

Output:

       a  b
    0  1  3

    print(type(df.iloc[[0]]))

    <class 'pandas.core.frame.DataFrame'>

This is short for df.iloc[[0], :].
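As a minimal sketch of the difference, reusing the frame from the question (the variable names here are only illustrative): a scalar indexer drops the row dimension, while a one-element list keeps it.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

row_as_series = df.iloc[0]    # scalar indexer: the row dimension is dropped
row_as_frame = df.iloc[[0]]   # one-element list: the row dimension is kept

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame
print(row_as_frame.shape)            # (1, 2)
```

So if your code always passes a list (or a slice) instead of a bare integer, you no longer need to check the return type.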

Scott Boston answered Sep 28 '22 00:09


Accessing row(s) by label: loc

    # Setup
    df = pd.DataFrame({'X': [1, 2, 3], 'Y':[4, 5, 6]}, index=['a', 'b', 'c'])
    df

       X  Y
    a  1  4
    b  2  5
    c  3  6

To get a DataFrame instead of a Series, pass a list of labels of length 1:

    df.loc[['a']]
    # Same as
    df.loc[['a'], :] # selects all columns

       X  Y
    a  1  4

To select multiple specific rows, use

    df.loc[['a', 'c']]

       X  Y
    a  1  4
    c  3  6

To select a contiguous range of rows, use

    df.loc['b':'c']

       X  Y
    b  2  5
    c  3  6
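Because label slices include both endpoints, a slice that starts and stops at the same label is another way to get a one-row DataFrame. A small sketch, assuming the index labels are unique as in the setup above:

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]}, index=['a', 'b', 'c'])

# Label slices include both endpoints, so 'b':'b' selects exactly one row.
one_row = df.loc['b':'b']

print(type(one_row).__name__)  # DataFrame
print(one_row.shape)           # (1, 2)
```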

Accessing row(s) by position: iloc

Specify a list of positions of length 1:

    i = 1
    df.iloc[[i]]

       X  Y
    b  2  5

Or, specify a slice of length 1:

    df.iloc[i:i+1]

       X  Y
    b  2  5
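One caveat worth keeping in mind: the two forms are not interchangeable for negative positions. The sketch below (same setup frame, with i = -1 chosen only for illustration) shows that the list form still returns the last row, while the slice form turns into -1:0 and comes back empty.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]}, index=['a', 'b', 'c'])

i = -1  # last row, counted from the end

as_list = df.iloc[[i]]        # one-row DataFrame holding row 'c'
as_slice = df.iloc[i:i + 1]   # becomes df.iloc[-1:0], which selects nothing

print(as_list.shape)   # (1, 2)
print(as_slice.shape)  # (0, 2)
```

If i can be negative, the list form is probably the safer default.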

To select multiple rows or a contiguous slice, use syntax similar to that of loc.
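For illustration, here is roughly what that looks like on the setup frame. The one difference to watch for is that positional slices exclude the stop position, whereas label slices with loc include the stop label.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]}, index=['a', 'b', 'c'])

picked = df.iloc[[0, 2]]  # rows at positions 0 and 2 (labels 'a' and 'c')
run = df.iloc[1:3]        # positions 1 and 2; the stop position 3 is excluded

print(picked.equals(df.loc[['a', 'c']]))  # True
print(run.equals(df.loc['b':'c']))        # True
```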

cs95 answered Sep 27 '22 22:09