
Why can't I append pandas dataframe in a loop

I know that there are several ways to build up a dataframe in Pandas. My question is simply to understand why the method below doesn't work.

First, a working example. I can create an empty dataframe and then append a new one, similar to the documentation:

In [3]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
   ...: df2 = pd.DataFrame()    
   ...: df2.append(df1)   

Out[3]:
   a  b
0  1  2

However, if I do the following, df2 ends up empty:

In [10]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
    ...: df2 = pd.DataFrame()
    ...: for i in range(10):
    ...:     df2.append(df1)

In [11]: df2
Out[11]:
Empty DataFrame
Columns: []
Index: []

Can someone explain why it works this way? Thanks!

asked May 05 '17 04:05 by mcragun


2 Answers

This happens because .append() does not modify df2 in place. It returns a new DataFrame, and your loop discards that return value:

Pandas Docs (0.19.2):

pandas.DataFrame.append

Returns: appended: DataFrame

Here's a working example so you can see what's happening in each iteration of the loop:

df1 = pd.DataFrame([[1,2],], columns=['a','b'])
df2 = pd.DataFrame()
for i in range(0,2):
    print(df2.append(df1))

>    a  b
> 0  1  2
>    a  b
> 0  1  2

If you assign the result of .append() back to a variable (even df2 itself), you get what you probably expected:

for i in range(0,2):
    df2 = df2.append(df1)
print(df2)

>    a  b
> 0  1  2
> 0  1  2
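
A side note for readers on current pandas: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the snippets above only run on older versions. What follows is a minimal sketch of the same discard-versus-reassign behaviour using pd.concat, assuming a recent pandas install; the variable names simply mirror the question.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2]], columns=["a", "b"])
df2 = pd.DataFrame()

# pd.concat returns a new frame; throwing the result away leaves df2 untouched,
# which is the same mistake as calling df2.append(df1) without assignment.
for _ in range(2):
    pd.concat([df2, df1])
print(df2.empty)  # prints True

# Rebinding df2 to the returned frame is what accumulates the rows.
for _ in range(2):
    df2 = pd.concat([df2, df1])
print(df2)
```

As in the .append() version, the row labels are carried over unchanged, so both rows keep the index 0 unless ignore_index=True is passed.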
answered Sep 27 '22 21:09 by Rod Manning


I think what you are looking for is:

df1 = pd.DataFrame()
df2 = pd.DataFrame([[1,2,3],], columns=['a','b','c'])


for i in range(0,4):
    df1 = df1.append(df2)

df1
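
If the goal is just to stack many frames, one common pattern is to collect them in a plain Python list and concatenate once at the end, which avoids copying the growing frame on every iteration. A minimal sketch, assuming a recent pandas; `frames` is a name introduced here for illustration only.

```python
import pandas as pd

df2 = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])

# list.append mutates the list in place, so no reassignment is needed here.
frames = []
for _ in range(4):
    frames.append(df2)

# A single concat at the end; ignore_index=True renumbers the rows 0..3.
df1 = pd.concat(frames, ignore_index=True)
print(df1)
```

The contrast with the question is that a Python list's append changes the list itself, whereas the pandas operations return a new object that has to be kept.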
answered Sep 27 '22 20:09 by TheManWhoKnows