Why is the shape of a selection from my pandas DataFrame wrong?

Tags:

python

slice

pandas

dataframe

shape

I have a pandas DataFrame called df whose df.shape is (53, 80); its index and column labels are both int.

If I select the first row like this, I get:

df.loc[0].shape
(80,)

instead of:

(1, 80)

But df.loc[0:0].shape and df[0:1].shape both show the shape I expect.

559

asked Jul 09 '18 16:07

SebMa

2 Answers

df.loc[0] returns a one-dimensional pd.Series object representing the data in a single row, extracted via indexing.

df.loc[0:0] returns a two-dimensional pd.DataFrame object representing the data in a dataframe with one row, extracted via slicing.

You can see this more clearly if you print the results of these operations:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3))

res1 = df.loc[0]
res2 = df.loc[0:0]

print(type(res1), res1, sep='\n')

<class 'pandas.core.series.Series'>
0    0
1    1
2    2
Name: 0, dtype: int32

print(type(res2), res2, sep='\n')

<class 'pandas.core.frame.DataFrame'>
   0  1  2
0  0  1  2

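The convention follows NumPy indexing and slicing, which is natural since pandas is built on NumPy arrays. One difference: NumPy slices are positional and exclude the end point, so the counterpart of df.loc[0:0] is arr[0:1] (arr[0:0] would be an empty array of shape (0, 3)).

arr = np.arange(9).reshape(3, 3)

print(arr[0].shape)    # (3,), i.e. 1-dimensional
print(arr[0:1].shape)  # (1, 3), i.e. 2-dimensional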
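If a one-row, two-dimensional result is what you are after, one option is to index with a list of labels instead of a scalar. The following is only a sketch: the zero-filled frame is a made-up stand-in with the same (53, 80) shape as in the question, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's frame: 53 rows, 80 columns, default int labels.
df = pd.DataFrame(np.zeros((53, 80)))

row_as_series = df.loc[0]                 # scalar label -> Series
row_as_frame = df.loc[[0]]                # list of labels -> DataFrame
row_from_series = df.loc[0].to_frame().T  # Series converted back to a one-row DataFrame

print(row_as_series.shape)    # (80,)
print(row_as_frame.shape)     # (1, 80)
print(row_from_series.shape)  # (1, 80)
```

The list form keeps the original column dtypes, whereas going through a Series and transposing may upcast mixed-type rows, so the list form is usually the simpler choice.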

112

answered Nov 14 '22 23:11

jpp

When you call df.iloc[0], you select the first row and get a Series, whereas df.iloc[0:1] slices the rows and returns a DataFrame. According to the pandas Series documentation, a Series is a:

One-dimensional ndarray with axis labels

whereas a DataFrame is two-dimensional (pandas DataFrame documentation).

Try running the following lines to see the difference:

print(type(df.iloc[0]))
# <class 'pandas.core.series.Series'>

print(type(df.iloc[0:1]))
# <class 'pandas.core.frame.DataFrame'>
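One detail worth checking when comparing the two accessors: .loc slices by label and includes the end label, while .iloc slices by position and excludes the end position, as Python lists do. A small sketch on a hypothetical 3x3 frame (not the question's data) shows the effect on the shape:

```python
import numpy as np
import pandas as pd

# Small illustrative frame with default integer labels 0, 1, 2.
df = pd.DataFrame(np.arange(9).reshape(3, 3))

label_slice = df.loc[0:0]        # label-based, end label included -> one row
empty_slice = df.iloc[0:0]       # position-based, end excluded -> no rows
positional_slice = df.iloc[0:1]  # position-based -> one row

print(label_slice.shape)       # (1, 3)
print(empty_slice.shape)       # (0, 3)
print(positional_slice.shape)  # (1, 3)
```

All three results are DataFrame objects; only the number of rows differs.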

answered Nov 15 '22 00:11

student
