
Append to a DataFrame in Pandas as new column

Tags: python, pandas

I have two DataFrames with the same index and want to append the second to the first. Let's say I have:

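import pandas as pd

df1 = pd.DataFrame([1,2,3], index = [2,3,4])
df2 = pd.DataFrame([3,5,3], index = [2,3,4])
# DataFrame.append was removed in pandas 2.0; pd.concat([df1, df2]) is the equivalent there
df1 = df1.append(df2)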

which returns

   0
2  1
3  2
4  3
2  3
3  5
4  3

But I want it to append a new column where the indexes match:

2  1  3
3  2  5
4  3  3

Is there a way to do this?

Asked by TheStrangeQuark on Aug 06 '15 at 17:08



2 Answers

Use pd.concat and pass axis=1 to concatenate the list of DataFrames column-wise, aligning the rows on their index:

In [3]:

df1 = pd.DataFrame([1,2,3], index = [2,3,4])
df2 = pd.DataFrame([3,5,3], index = [2,3,4])
pd.concat([df1,df2], axis=1)
Out[3]:
   0  0
2  1  3
3  2  5
4  3  3
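
The result above has two columns that are both labelled 0, which can make later column selection ambiguous. A minimal sketch of one way around that, assuming you are free to choose new labels (the names "first" and "second" are illustrative and not part of the question):

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
df2 = pd.DataFrame([3, 5, 3], index=[2, 3, 4])

# Column-wise concat aligns the rows on their index labels.
result = pd.concat([df1, df2], axis=1)

# Both columns are labelled 0, so assign distinct (hypothetical) names.
result.columns = ["first", "second"]
print(result)
```

After the rename, result["second"] selects exactly one column instead of both.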

You can also use join, but both frames have a column named 0, so you have to rename one of them first:

In [6]:

df1.join(df2.rename(columns={0:'x'}))
Out[6]:
   0  x
2  1  3
3  2  5
4  3  3
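
If renaming is inconvenient, join can also disambiguate overlapping column names with its lsuffix and rsuffix parameters. A small sketch, assuming two frames that share the column label "value" (the frame names, the label, and the suffixes are all made up for illustration):

```python
import pandas as pd

# Hypothetical frames whose columns share the label "value".
left = pd.DataFrame({"value": [1, 2, 3]}, index=[2, 3, 4])
right = pd.DataFrame({"value": [3, 5, 3]}, index=[2, 3, 4])

# Without suffixes, join raises because the column names overlap.
try:
    left.join(right)
    overlap_error = None
except ValueError as exc:
    overlap_error = exc

# lsuffix/rsuffix are appended to the overlapping labels instead.
joined = left.join(right, lsuffix="_left", rsuffix="_right")
print(joined)
```

The joined frame then has the columns value_left and value_right.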

Or use merge, specifying that you want to match on the indices:

In [8]:

df1.merge(df2.rename(columns={0:'x'}), left_index=True, right_index=True)
Out[8]:
   0  x
2  1  3
3  2  5
4  3  3
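
merge defaults to an inner join, so rows whose index label is missing from either frame are dropped. A hedged sketch of how the how parameter changes that, assuming a second frame whose index only partly overlaps the first (the column names "a" and "x" are illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[2, 3, 4])
# Hypothetical frame: label 5 replaces label 4, so the indexes differ.
df2 = pd.DataFrame({"x": [3, 5, 3]}, index=[2, 3, 5])

# Default how="inner": only labels present in both frames survive.
inner = df1.merge(df2, left_index=True, right_index=True)

# how="left": keep every row of df1, filling missing matches with NaN.
left = df1.merge(df2, left_index=True, right_index=True, how="left")

print(inner)
print(left)
```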
Answered by EdChum on Sep 20 '22 at 17:09


If the indexes match exactly and the other DataFrame has only one column (as in your question), then you can simply assign the other DataFrame as a new column.

>>> df1['new_column'] = df2
>>> df1
   0  new_column
2  1           3
3  2           5
4  3           3
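
The "indexes match exactly" condition matters because column assignment aligns on index labels rather than on position. A short sketch, assuming a second frame whose index only partly overlaps (the label 5 is a hypothetical mismatch):

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], index=[2, 3, 4])
# Hypothetical second frame: label 5 does not exist in df1.
df2 = pd.DataFrame([3, 5, 3], index=[2, 3, 5])

# Assigning a Series aligns on index labels, not on row position.
df1["new_column"] = df2[0]
print(df1)
```

Here the row labelled 4 gets NaN in new_column, and the value stored under label 5 is silently dropped, because df1 has no such row.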

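In general, the concat approach is better, because it also handles indexes that do not match exactly: you can choose an inner join (keep only the labels present in both frames) or an outer join (keep all labels and fill the gaps with NaN). Starting again from the original single-column df1:

>>> df1 = pd.DataFrame([1,2,3], index = [2,3,4])
>>> df2 = pd.DataFrame([3,5,3], index = [2,3,5])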
>>> df2
   0
2  3
3  5
5  3

>>> pd.concat([df1, df2], axis=1, join='inner')
   0  0
2  1  3
3  2  5

>>> pd.concat([df1, df2], axis=1, join='outer')
     0    0
2  1.0  3.0
3  2.0  5.0
4  3.0  NaN
5  NaN  3.0
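
The NaN values introduced by an outer join force the integer columns to float. If integer values need to be preserved, one option is pandas' nullable "Int64" dtype. A sketch under the assumption that the columns are named "a" and "b" (illustrative labels, used here to avoid the duplicate 0):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[2, 3, 4])
df2 = pd.DataFrame({"b": [3, 5, 3]}, index=[2, 3, 5])

# Plain int64 columns are upcast to float64 to make room for NaN.
outer = pd.concat([df1, df2], axis=1, join="outer")

# Nullable Int64 columns keep integer values and use <NA> for the gaps.
nullable = pd.concat([df1.astype("Int64"), df2.astype("Int64")], axis=1)

print(outer.dtypes)
print(nullable)
```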
Answered by vk1011 on Sep 20 '22 at 17:09