I have two DataFrames with the same indexing and want to append the second to the first. Let's say I have:
df1 = pd.DataFrame([1,2,3], index = [2,3,4])
df2 = pd.DataFrame([3,5,3], index = [2,3,4])
df1 = df1.append(df2)
which returns
0
2 1
3 2
4 3
2 3
3 5
4 3
But I want it to append a new column where the indexes match:
2 1 3
3 2 5
4 3 3
Is there a way to do this?
The append() function is used to append the rows of another DataFrame to the end of the given DataFrame, returning a new DataFrame object. Columns not in the original DataFrame are added as new columns, and the new cells are populated with NaN. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0.
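A minimal sketch of the same row-wise stacking with pd.concat, which also works on pandas versions that no longer provide append(). The frames are the ones from the question; the name `stacked` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
df2 = pd.DataFrame([3, 5, 3], index=[2, 3, 4])

# Stack df2's rows below df1's rows (axis=0 is the default).
stacked = pd.concat([df1, df2])

# The index labels 2, 3, 4 now each appear twice, as in the question's output.
print(stacked)
```

This is the behaviour the question is trying to avoid: rows are added, not columns.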
Using the apply() method: if you need to apply a function over existing columns in order to compute values that will be added as a new column in the existing DataFrame, then the pandas.DataFrame.apply() method should do the trick.
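As a rough illustration of that idea, the sketch below computes a new column row by row with DataFrame.apply. The frame and the column names (`price`, `quantity`, `total`) are hypothetical and not taken from the question.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"price": [10, 20, 30], "quantity": [1, 2, 3]})

# axis=1 passes each row to the function as a Series;
# the returned values become the new column.
df["total"] = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

print(df)
```

For simple arithmetic like this, a vectorized expression such as df["price"] * df["quantity"] is usually faster; apply is more useful when the per-row logic cannot be expressed that way.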
You can simply assign the values of your Series into the existing DataFrame to add a new column: series = pd.Series([40, 38, 32.5, 27, 30], index=[0, 1, 2, 3, 4])
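A minimal sketch of that assignment. Only the Series values come from the text above; the target frame and the column names `name` and `value` are assumptions made for the example.

```python
import pandas as pd

# Hypothetical frame; only the Series below comes from the text.
df = pd.DataFrame({"name": ["a", "b", "c", "d", "e"]}, index=[0, 1, 2, 3, 4])
series = pd.Series([40, 38, 32.5, 27, 30], index=[0, 1, 2, 3, 4])

# Assignment aligns on index labels, not on position.
df["value"] = series

print(df)
```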
Create a new column by assigning the output to the DataFrame with a new column name in between the []. Operations are element-wise, so there is no need to loop over rows. Use rename with a dictionary or function to rename row labels or column names.
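The two points above can be sketched as follows; the frame and the names `width`, `height`, `area` and `area_cm2` are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"width": [1, 2, 3], "height": [4, 5, 6]})

# Element-wise multiplication of two columns; no explicit loop needed.
df["area"] = df["width"] * df["height"]

# rename returns a new DataFrame; dictionaries map old labels to new ones.
renamed = df.rename(columns={"area": "area_cm2"}, index={0: "first"})

print(renamed)
```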
Use concat and pass the parameter axis=1 to concatenate the list of DataFrames column-wise:
In [3]:
df1 = pd.DataFrame([1,2,3], index = [2,3,4])
df2 = pd.DataFrame([3,5,3], index = [2,3,4])
pd.concat([df1,df2], axis=1)
Out[3]:
0 0
2 1 3
3 2 5
4 3 3
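Note that the result above has two columns both labelled 0, which makes later column selection ambiguous. One possible way around this, sketched here with made-up names `first` and `second`, is to relabel the columns after concatenating:

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
df2 = pd.DataFrame([3, 5, 3], index=[2, 3, 4])

combined = pd.concat([df1, df2], axis=1)

# Replace the duplicate labels (0, 0) with distinct, illustrative names.
combined.columns = ["first", "second"]

print(combined)
```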
You can also use join, but you have to rename the column first:
In [6]:
df1.join(df2.rename(columns={0:'x'}))
Out[6]:
0 x
2 1 3
3 2 5
4 3 3
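As an alternative to renaming beforehand, join accepts lsuffix and rsuffix arguments that are appended to overlapping column names. A minimal sketch, with the suffix strings chosen arbitrarily:

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
df2 = pd.DataFrame([3, 5, 3], index=[2, 3, 4])

# Both frames have a column labelled 0; the suffixes disambiguate them.
joined = df1.join(df2, lsuffix="_left", rsuffix="_right")

print(joined)
```

Without a suffix or a rename, join raises a ValueError because the column names overlap.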
Or merge, specifying that you wish to match on indices:
In [8]:
df1.merge(df2.rename(columns={0:'x'}), left_index=True, right_index=True)
Out[8]:
0 x
2 1 3
3 2 5
4 3 3
If the indexes match exactly and there's only one column in the other DataFrame (like your question has), then you could even just add the other DataFrame as a new column.
>>> df1['new_column'] = df2
>>> df1
0 new_column
2 1 3
3 2 5
4 3 3
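Direct assignment aligns on index labels rather than on position, so it is worth knowing what happens when the labels do not all match. A minimal sketch, assuming a hypothetical frame `other` whose last label is 5 instead of 4:

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
other = pd.DataFrame([3, 5, 3], index=[2, 3, 5])  # label 5 instead of 4

# Assigning a Series aligns on index labels: label 4 has no match in
# `other`, so it becomes NaN, and the row labelled 5 is silently dropped.
df1["new_column"] = other[0]

print(df1)
```

Because unmatched rows disappear without any warning, this shortcut is safest when the indexes are known to be identical.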
In general, the concat approach is better. If you have different indexes, you can choose to do an inner join or outer join.
>>> df2 = pd.DataFrame([3,5,3], index = [2,3,5])
>>> df2
0
2 3
3 5
5 3
>>> pd.concat([df1, df2], axis=1, join='inner')
0 0
2 1 3
3 2 5
>>> pd.concat([df1, df2], axis=1, join='outer')
0 0
2 1 3
3 2 5
4 3 NaN
5 NaN 3