I know that there are several ways to build up a dataframe in Pandas. My question is simply to understand why the method below doesn't work.
First, a working example. I can create an empty dataframe and then append a new one similar to the documenta
In [3]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
...: df2 = pd.DataFrame()
...: df2.append(df1)
Out[3]: a b
0 1 2
However, if I do the following df2 becomes None:
In [10]: df1 = pd.DataFrame([[1,2],], columns = ['a', 'b'])
...: df2 = pd.DataFrame()
...: for i in range(10):
...: df2.append(df1)
In [11]: df2
Out[11]:
Empty DataFrame
Columns: []
Index: []
Can someone explain why it works this way? Thanks!
This happens because the .append() method returns a new df:
Pandas Docs (0.19.2):
pandas.DataFrame.append
Returns: appended: DataFrame
Here's a working example so you can see what's happening in each iteration of the loop:
df1 = pd.DataFrame([[1,2],], columns=['a','b'])
df2 = pd.DataFrame()
for i in range(0,2):
print(df2.append(df1))
> a b
> 0 1 2
> a b
> 0 1 2
If you assign the output of .append() to a df (even the same one) you'll get what you probably expected:
for i in range(0,2):
df2 = df2.append(df1)
print(df2)
> a b
> 0 1 2
> 0 1 2
I think what you are looking for is:
df1 = pd.DataFrame()
df2 = pd.DataFrame([[1,2,3],], columns=['a','b','c'])
for i in range(0,4):
df1 = df1.append(df2)
df1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With