The easiest way to add or insert a new row into a Pandas DataFrame is to use the Pandas . append() method. The . append() method is a helper method, for the Pandas concat() function.
Use concat() to Add a Row at Top of DataFrame concat([new_row,df. loc[:]]). reset_index(drop=True) to add the row to the first position of the DataFrame as Index starts from zero. reset_index() will reset the Index on the DataFrame to adjust the indexes on other rows.
Just assign row to a particular index, using loc
:
df.loc[-1] = [2, 3, 4] # adding a row
df.index = df.index + 1 # shifting index
df = df.sort_index() # sorting by index
And you get, as desired:
A B C
0 2 3 4
1 5 6 7
2 7 8 9
See in Pandas documentation Indexing: Setting with enlargement.
Not sure how you were calling concat()
but it should work as long as both objects are of the same type. Maybe the issue is that you need to cast your second vector to a dataframe? Using the df that you defined the following works for me:
df2 = pd.DataFrame([[2,3,4]], columns=['A','B','C'])
pd.concat([df2, df])
One way to achieve this is
>>> pd.DataFrame(np.array([[2, 3, 4]]), columns=['A', 'B', 'C']).append(df, ignore_index=True)
Out[330]:
A B C
0 2 3 4
1 5 6 7
2 7 8 9
Generally, it's easiest to append dataframes, not series. In your case, since you want the new row to be "on top" (with starting id), and there is no function pd.prepend()
, I first create the new dataframe and then append your old one.
ignore_index
will ignore the old ongoing index in your dataframe and ensure that the first row actually starts with index 1
instead of restarting with index 0
.
Typical Disclaimer: Cetero censeo ... appending rows is a quite inefficient operation. If you care about performance and can somehow ensure to first create a dataframe with the correct (longer) index and then just inserting the additional row into the dataframe, you should definitely do that. See:
>>> index = np.array([0, 1, 2])
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[0:1] = [list(s1), list(s2)]
>>> df2
Out[336]:
A B C
0 5 6 7
1 7 8 9
2 NaN NaN NaN
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[1:] = [list(s1), list(s2)]
So far, we have what you had as df
:
>>> df2
Out[339]:
A B C
0 NaN NaN NaN
1 5 6 7
2 7 8 9
But now you can easily insert the row as follows. Since the space was preallocated, this is more efficient.
>>> df2.loc[0] = np.array([2, 3, 4])
>>> df2
Out[341]:
A B C
0 2 3 4
1 5 6 7
2 7 8 9
I put together a short function that allows for a little more flexibility when inserting a row:
def insert_row(idx, df, df_insert):
dfA = df.iloc[:idx, ]
dfB = df.iloc[idx:, ]
df = dfA.append(df_insert).append(dfB).reset_index(drop = True)
return df
which could be further shortened to:
def insert_row(idx, df, df_insert):
return df.iloc[:idx, ].append(df_insert).append(df.iloc[idx:, ]).reset_index(drop = True)
Then you could use something like:
df = insert_row(2, df, df_new)
where 2
is the index position in df
where you want to insert df_new
.
It is pretty simple to add a row into a pandas DataFrame
:
Create a regular Python dictionary with the same columns names as your Dataframe
;
Use pandas.append()
method and pass in the name of your dictionary, where .append()
is a method on DataFrame instances;
Add ignore_index=True
right after your dictionary name.
We can use numpy.insert
. This has the advantage of flexibility. You only need to specify the index you want to insert to.
s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])
df = pd.DataFrame([list(s1), list(s2)], columns = ["A", "B", "C"])
pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0))
0 1 2
0 2 3 4
1 5 6 7
2 7 8 9
For np.insert(df.values, 0, values=[2, 3, 4], axis=0)
, 0 tells the function the place/index you want to place the new values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With