Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

vlookup in Pandas using join

I have the following 2 dataframes

Example1 sku loc flag   122  61 True  123  61 True 113  62 True  122  62 True  123  62 False 122  63 False 301  63 True   Example2  sku dept  113 a 122 b 123 b 301 c  

I want to perform a merge, or join opertation using Pandas (or whichever Python operator is best) to produce the below data frame.

Example3 sku loc flag   dept   122  61 True   b 123  61 True   b 113  62 True   a 122  62 True   b 123  62 False  b 122  63 False  b 301  63 True   c  Both  df_Example1.join(df_Example2,lsuffix='_ProdHier') df_Example1.join(df_Example2,how='outer',lsuffix='_ProdHier') 

Aren't working. What am I doing wrong?

like image 222
Alex Kinman Avatar asked Aug 25 '14 20:08

Alex Kinman


People also ask

How do you do a VLOOKUP with pandas?

We can use merge() function to perform Vlookup in pandas. The merge function does the same job as the Join in SQL We can perform the merge operation with respect to table 1 or table 2. There can be different ways of merging the 2 tables.

How do you do a join in pandas DataFrame?

Join DataFrames using their indexes. If we want to join using the key columns, we need to set key to be the index in both df and other . The joined DataFrame will have key as its index. Another option to join using the key columns is to use the on parameter.


1 Answers

Perform a left merge, this will use sku column as the column to join on:

In [26]:  df.merge(df1, on='sku', how='left') Out[26]:    sku  loc   flag dept 0  122   61   True    b 1  122   62   True    b 2  122   63  False    b 3  123   61   True    b 4  123   62  False    b 5  113   62   True    a 6  301   63   True    c 

If sku is in fact your index then do this:

In [28]:  df.merge(df1, left_index=True, right_index=True, how='left') Out[28]:      loc   flag dept sku                  113   62   True    a 122   61   True    b 122   62   True    b 122   63  False    b 123   61   True    b 123   62  False    b 301   63   True    c 

Another method is to use map, if you set sku as the index on your second df, so in effect it becomes a Series then the code simplifies to this:

In [19]:  df['dept']=df.sku.map(df1.dept) df Out[19]:    sku  loc   flag dept 0  122   61   True    b 1  123   61   True    b 2  113   62   True    a 3  122   62   True    b 4  123   62  False    b 5  122   63  False    b 6  301   63   True    c 
like image 88
EdChum Avatar answered Oct 02 '22 10:10

EdChum