Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python - function similar to VLOOKUP (Excel)

i am trying to join two data frames but cannot get my head around the possibilities Python has to offer.

First dataframe:

ID MODEL   REQUESTS ORDERS
1  Golf    123      4
2  Passat  34       5
3  Model 3 500      8
4  M3      5        0

Second dataframe:

MODEL   TYPE  MAKE
Golf    Sedan Volkswagen
M3      Coupe BMW
Model 3 Sedan Tesla

What I want is to add another column in the first dataframe called "make" so that it looks like this:

ID MODEL   MAKE       REQUESTS ORDERS
1  Golf    Volkswagen 123      4
2  Passat  Volkswagen 34       5
3  Model 3 Tesla      500      8
4  M3      BMW        5        0

I already looked at merge, join and map but all examples just appended the required information at the end of the dataframe.

like image 699
Christian Avatar asked Jan 06 '17 18:01

Christian


People also ask

What function is similar to VLOOKUP?

The superior alternative to VLOOKUP is INDEX MATCH. While VLOOKUP works fine in most cases, it tends to reveal flaws when you start using it in large and complex sheets. The INDEX MATCH formula is actually two different functions; INDEX and MATCH. array is range of cells or an array constant.

How do you do a VLOOKUP without pandas in Python?

To do this without pandas (and assuming you know the structure of your data + it fits in memory), you can iterate through the csv file and store the results in a dictionary, where you fill the entries where the ID maps to the other information that you want to keep.

How do you look up in Python?

DataFrame - lookup() function The lookup() function returns label-based "fancy indexing" function for DataFrame. Given equal-length arrays of row and column labels, return an array of the values corresponding to each (row, col) pair.

Is VLOOKUP still better compare to Xlookup?

XLOOKUP is much more flexible than VLOOKUP, which can look up values only in the leftmost column of a table, and return values from corresponding columns on the right, as we saw in the example above. In contrast, the XLOOKUP model requires simpler steps and can return values for any column in either direction.


2 Answers

I think you can use insert with map by Series created with df2 (if some value in column MODEL in df2 is missing get NaN):

df1.insert(2, 'MAKE', df1['MODEL'].map(df2.set_index('MODEL')['MAKE']))
print (df1)
   ID    MODEL        MAKE  REQUESTS  ORDERS
0   1     Golf  Volkswagen       123       4
1   2   Passat         NaN        34       5
2   3  Model 3       Tesla       500       8
3   4       M3         BMW         5       0
like image 150
jezrael Avatar answered Oct 16 '22 02:10

jezrael


Although not in this case, but there might be scenarios where df2 has more than two columns and you would just want to add one out of those to df1 based on a specific column as key. Here is a generic code that you may find useful.

df = pd.merge(df1, df2[['MODEL', 'MAKE']], on = 'MODEL', how = 'left')
like image 22
Bhagabat Behera Avatar answered Oct 16 '22 02:10

Bhagabat Behera