I know that there are several ways to build up a dataframe in Pandas. My question is simply to understand why the method below doesn't work.
First, a working example. I can create an empty dataframe and then append another one to it, similar to the documentation:
In [3]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
   ...: df2 = pd.DataFrame()    
   ...: df2.append(df1)   
Out[3]:   a  b
        0  1  2
However, if I do the following, df2 stays empty:
In [10]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
    ...: df2 = pd.DataFrame()
    ...: for i in range(10):
    ...:     df2.append(df1)
In [11]: df2
Out[11]:
Empty DataFrame
Columns: []
Index: []
Can someone explain why it works this way? Thanks!
This happens because the .append() method does not modify the DataFrame in place; it returns a new DataFrame, which your loop discards:
Pandas Docs (0.19.2):
pandas.DataFrame.append
Returns: appended: DataFrame
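The confusion may come from Python's list.append(), which does modify the list in place. Here is a minimal sketch of the two conventions using only built-in types; the str example is just an analogy for the return-a-new-object behaviour of DataFrame.append, not pandas code:

```python
# list.append mutates the list in place and returns None.
items = []
result = items.append(1)
print(items, result)  # [1] None

# str methods return a new object and leave the original unchanged,
# which is the same convention DataFrame.append follows.
text = "ab"
text.upper()          # result is discarded, text is unchanged
print(text)           # ab

text = text.upper()   # rebind the name to keep the result
print(text)           # AB
```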
Here's a working example so you can see what's happening in each iteration of the loop:
import pandas as pd

df1 = pd.DataFrame([[1,2],], columns=['a','b'])
df2 = pd.DataFrame()
for i in range(0,2):
    print(df2.append(df1))
>    a  b
> 0  1  2
>    a  b
> 0  1  2
If you assign the output of .append() to a df (even the same one) you'll get what you probably expected:
for i in range(0,2):
    df2 = df2.append(df1)
print(df2)
>    a  b
> 0  1  2
> 0  1  2
I think what you are looking for is:
import pandas as pd

df1 = pd.DataFrame()
df2 = pd.DataFrame([[1,2,3],], columns=['a','b','c'])
for i in range(0,4):
    df1 = df1.append(df2)
df1
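Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loops above only run on older versions. On current versions, one common alternative is to collect the pieces in a list and call pd.concat once. A rough sketch of that approach (the name frames is only for illustration, and ignore_index=True is an optional choice that renumbers the rows instead of repeating index 0):

```python
import pandas as pd

df2 = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])

# Collect the pieces in a plain list first.
frames = [df2 for _ in range(4)]

# Concatenate once; ignore_index=True gives a fresh 0..3 index.
df1 = pd.concat(frames, ignore_index=True)
print(df1)
```

Like .append(), pd.concat returns a new DataFrame, so the result still has to be assigned to a name. Concatenating once at the end also avoids copying the growing frame on every iteration.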