
Is it possible to add several columns at once to a pandas DataFrame?


If I want to create a new DataFrame with several columns, I can add all the columns at once -- for example, as follows:

    import pandas as pd

    data = {'col_1': [0, 1, 2, 3],
            'col_2': [4, 5, 6, 7]}
    df = pd.DataFrame(data)

Now suppose that, further down the road, I want to add a set of additional columns to this DataFrame. Is there a way to add them all at once, as in the following?

    additional_data = {'col_3': [8, 9, 10, 11],
                       'col_4': [12, 13, 14, 15]}
    # Below is a made-up function of the kind I desire.
    df.add_data(additional_data)

I'm aware I could do this:

    # dict.iteritems() exists only in Python 2; items() works in Python 3.
    for key, value in additional_data.items():
        df[key] = value

Or this:

    df2 = pd.DataFrame(additional_data, index=df.index)
    # Merge on the shared index rather than passing the index object as a key.
    df = pd.merge(df, df2, left_index=True, right_index=True)

I was just hoping for something cleaner. If I'm stuck with these two options, which is preferred?

dbliss asked Nov 08 '13 18:11





2 Answers

pandas has had the assign method since version 0.16.0. You can pass it a second DataFrame, unpacked into keyword arguments:

    In [1506]: df1.assign(**df2)
    Out[1506]:
       col_1  col_2  col_3  col_4
    0      0      4      8     12
    1      1      5      9     13
    2      2      6     10     14
    3      3      7     11     15

Or you can unpack the dictionary directly:

    In [1507]: df1.assign(**additional_data)
    Out[1507]:
       col_1  col_2  col_3  col_4
    0      0      4      8     12
    1      1      5      9     13
    2      2      6     10     14
    3      3      7     11     15
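One detail worth checking when using assign: it returns a new DataFrame and leaves the original untouched, so the result has to be bound to a name. The sketch below assumes df1 is built from the question's data dictionary; the helper names result, original_columns, and result_columns are introduced here only for illustration.

```python
import pandas as pd

# Assumed setup, mirroring the question's data.
df1 = pd.DataFrame({'col_1': [0, 1, 2, 3],
                    'col_2': [4, 5, 6, 7]})
additional_data = {'col_3': [8, 9, 10, 11],
                   'col_4': [12, 13, 14, 15]}

# assign() does not modify df1; it returns a copy with the extra columns.
result = df1.assign(**additional_data)

original_columns = list(df1.columns)    # df1 still has only its two columns
result_columns = list(result.columns)   # the copy carries all four

# Rebind the name to keep the new columns on df1 itself.
df1 = df1.assign(**additional_data)
```

Because the dictionary is unpacked into keyword arguments, its keys must be strings for this to work.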
Zero answered Sep 18 '22 22:09



What you need is the join method:

    df1.join(df2, how='outer')
    # or
    df1.join(df2)  # this also works

Example:

    import pandas as pd

    data = {'col_1': [0, 1, 2, 3],
            'col_2': [4, 5, 6, 7]}
    df1 = pd.DataFrame(data)

    additional_data = {'col_3': [8, 9, 10, 11],
                       'col_4': [12, 13, 14, 15]}
    df2 = pd.DataFrame(additional_data)

    df1.join(df2, how='outer')

Output:

       col_1  col_2  col_3  col_4
    0      0      4      8     12
    1      1      5      9     13
    2      2      6     10     14
    3      3      7     11     15
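The two join calls give the same result here because both frames share the index 0..3. They differ only when the indexes do not line up. A small sketch, using a hypothetical df2 whose index is shifted by one row (this variation is not from the original answer):

```python
import pandas as pd

df1 = pd.DataFrame({'col_1': [0, 1, 2, 3],
                    'col_2': [4, 5, 6, 7]})

# Hypothetical df2 with index 1..4 instead of 0..3, for illustration only.
df2 = pd.DataFrame({'col_3': [8, 9, 10, 11],
                    'col_4': [12, 13, 14, 15]},
                   index=[1, 2, 3, 4])

left = df1.join(df2)                 # default how='left': keeps df1's rows
outer = df1.join(df2, how='outer')   # keeps the union of both indexes

left_index = list(left.index)
outer_index = list(outer.index)

# Row 0 has no partner in df2, so its new columns are NaN in the left join.
missing_in_left = int(left['col_3'].isna().sum())
```

With the default left join, rows of df2 that have no match in df1 are dropped; with how='outer' they are kept and the df1 columns are filled with NaN for those rows.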
min answered Sep 20 '22 22:09
