Adding a new column in pandas dataframe from another dataframe with differing indices

I have an original dataframe and a second dataframe that contains a single column. I want to add the second dataframe's column to the end of the original dataframe. The two dataframes have different indices. This is what I tried:

feature_file_df['RESULT']=RESULT_df['RESULT']

The RESULT column was added, but all of its values are NaN.

How can I add the column with its actual values?

asked Sep 25 '17 00:09 by Jayashree



1 Answer

Assuming both dataframes have the same number of rows, you can assign RESULT_df['RESULT'].values to your original dataframe. The values are then assigned by position, so you don't have to worry about the differing indices.

# pandas < 0.24
feature_file_df['RESULT'] = RESULT_df['RESULT'].values
# pandas >= 0.24
feature_file_df['RESULT'] = RESULT_df['RESULT'].to_numpy()
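Since this relies on both frames having the same number of rows, it may be worth checking that before assigning. The following is only a sketch: the two small frames are hypothetical stand-ins for the ones in the question, and the explicit length check is an addition for illustration, not something pandas requires.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
feature_file_df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=[0, 1, 2])
RESULT_df = pd.DataFrame({"RESULT": [10, 20, 30]}, index=[7, 8, 9])

# Positional assignment only makes sense when the row counts match.
if len(feature_file_df) != len(RESULT_df):
    raise ValueError("row counts differ; cannot assign by position")

# The array carries no index, so values are paired with rows by position.
feature_file_df["RESULT"] = RESULT_df["RESULT"].to_numpy()
```

Rows are matched purely by order here, so this is appropriate only when the n-th row of one frame really corresponds to the n-th row of the other.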

Minimal Code Sample

df
          A         B
0 -1.202564  2.786483
1  0.180380  0.259736
2 -0.295206  1.175316
3  1.683482  0.927719
4 -0.199904  1.077655

df2

           C
11 -0.140670
12  1.496007
13  0.263425
14 -0.557958
15 -0.018375

Let's try direct assignment first.

df['C'] = df2['C']
df

          A         B   C
0 -1.202564  2.786483 NaN
1  0.180380  0.259736 NaN
2 -0.295206  1.175316 NaN
3  1.683482  0.927719 NaN
4 -0.199904  1.077655 NaN
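The NaNs appear because assigning a Series to a column aligns on index labels, and the two frames share none. A small sketch of that check, using made-up values rather than the data above:

```python
import pandas as pd

# Made-up frames whose index labels do not overlap.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=[0, 1, 2])
df2 = pd.DataFrame({"C": [0.1, 0.2, 0.3]}, index=[11, 12, 13])

# Labels present in both indexes; empty here.
shared_labels = df.index.intersection(df2.index)

# Assignment aligns on labels, so with no shared labels nothing is copied.
df["C"] = df2["C"]
all_missing = bool(df["C"].isna().all())
```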

Now, assign the array returned by .values (or .to_numpy() for pandas versions >= 0.24). .values returns a NumPy array, which does not have an index, so the values are assigned by position.

df2['C'].values 
array([-0.141,  1.496,  0.263, -0.558, -0.018])

df['C'] = df2['C'].values
df

          A         B         C
0 -1.202564  2.786483 -0.140670
1  0.180380  0.259736  1.496007
2 -0.295206  1.175316  0.263425
3  1.683482  0.927719 -0.557958
4 -0.199904  1.077655 -0.018375
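For anyone who wants to reproduce the sample, here is a self-contained sketch that rebuilds both frames from the values printed above. The names by_label and by_position are introduced only for this illustration, and DataFrame.assign is used in place of in-place assignment so that both results can be kept side by side.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [-1.202564, 0.180380, -0.295206, 1.683482, -0.199904],
        "B": [2.786483, 0.259736, 1.175316, 0.927719, 1.077655],
    }
)
df2 = pd.DataFrame(
    {"C": [-0.140670, 1.496007, 0.263425, -0.557958, -0.018375]},
    index=[11, 12, 13, 14, 15],
)

# Label-based assignment: no shared labels, so every value is NaN.
by_label = df.assign(C=df2["C"])

# Position-based assignment: the index of df2 is dropped first.
by_position = df.assign(C=df2["C"].to_numpy())
```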
answered Oct 16 '22 19:10 by cs95