Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mapping columns from one dataframe to another to create a new column [duplicate]

i have a dataframe

id  store    address 1    100        xyz 2    200        qwe 3    300        asd 4    400        zxc 5    500        bnm 

i have another dataframe df2

serialNo    store_code  warehouse     1          300         Land     2          500         Sea     3          100         Land     4          200         Sea     5          400         Land 

I want my final dataframe to look like:

id  store    address  warehouse 1    100        xyz     Land 2    200        qwe     Sea 3    300        asd     Land 4    400        zxc     Land 5    500        bnm     Sea 

i.e map from one dataframe onto another creating new column

like image 689
Shubham R Avatar asked Sep 05 '17 07:09

Shubham R


People also ask

How do you create a new column in DataFrame based on another column?

Using apply() method If you need to apply a method over an existing column in order to compute some values that will eventually be added as a new column in the existing DataFrame, then pandas. DataFrame. apply() method should do the trick.

How do you create a DataFrame with columns from another DataFrame?

You can create a new DataFrame of a specific column by using DataFrame. assign() method. The assign() method assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones.


2 Answers

df.merge

out = (df1.merge(df2, left_on='store', right_on='store_code')           .reindex(columns=['id', 'store', 'address', 'warehouse'])) print(out)     id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea 

pd.concat + df.sort_values

u = df1.sort_values('store') v = df2.sort_values('store_code')[['warehouse']].reset_index(drop=1) out = pd.concat([u, v], 1)  print(out)     id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea      

The first sort call is redundant assuming your dataframe is already sorted on store, in which case you may remove it.


df.replace/df.map

s = df1.store.replace(df2.set_index('store_code')['warehouse']) print(s)  0    Land 1     Sea 2    Land 3    Land 4     Sea  df1['warehouse'] = s print(df1)     id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea 

Alternatively, create a mapping explicitly. This works if you want to use it later.

mapping = dict(df2[['store_code', 'warehouse']].values) df1['warehouse'] = df1.store.map(mapping) print(df1)     id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea 
like image 95
cs95 Avatar answered Oct 11 '22 18:10

cs95


Use map or join:

df1['warehouse'] = df1['store'].map(df2.set_index('store_code')['warehouse']) print (df1)    id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea 

df1 = df1.join(df2.set_index('store_code'), on=['store']).drop('serialNo', 1) print (df1)    id  store address warehouse 0   1    100     xyz      Land 1   2    200     qwe       Sea 2   3    300     asd      Land 3   4    400     zxc      Land 4   5    500     bnm       Sea 
like image 38
jezrael Avatar answered Oct 11 '22 18:10

jezrael