how to do a vlookup in pandas with counts

Question

I have the two following input dataframes :

df_item_in_cat
      a     b     c     d
0     A     A     B     A
1     B     B     E     B
2     C     E     F     E
3     D     F     G     G

df_charac_by_cat
    cat      charac
0     a          10
1     b          20
2     c          25
3     d          15

And I would like to produce the below dataframe :

df
   item   cat_nb   sum_charac
0     A        3           45
1     B        4           70
2     C        1           10
3     D        1           10
4     E        2           45
5     F        2           45
6     G        2           40

where :

cat_nb is the nb of columns where each item appear on df_item_in_cat
sum_charac is the sum of charac from df_charac_by_cat of the different categories

If needed, I can get the Serie :

s_items

0  A
1  B
2  C
3  D
4  E
5  F
6  G

To be more concrete, I have products that belong to categories, and I have shops that are associated with a category. And I want to know in how many shops each product is available, based on the caterogy of each product, and the category of each shop. Hope it's clear ..

(a,b,c,d) are categories
(A,B,C,D,E,F,G) are products
(10,20,25,15) are the number of shops associated for each category

ex : product A that is in 3 categories (a,b,d), is available in 45 shops, because 10+20+15 = 45

It's like a "vlookup" in excel.

I'm pretty sure I could do it with an iterrows, but I'm looking for a "beautiful" solution.

I'd like to show you some researchs, but I really have no idea how to do that .. I tried with merges, or pivot, but it's not really usefull.

Scott Boston · Accepted Answer

You need:

df_item_in_cat.melt().merge(df_charac_by_cat, left_on='variable', right_on='cat')\
   .groupby('value')['charac'].agg(['count','sum'])

Output:

  value  count  sum
0     A      3   45
1     B      4   70
2     C      1   10
3     D      1   10
4     E      3   60
5     F      2   45
6     G      2   40

And, this is with all the column renaming and "housekeeping":

df_item_in_cat.melt(value_name='item').merge(df_charac_by_cat, left_on='variable', right_on='cat')\
              .groupby('item')['charac'].agg(['count','sum']).reset_index()\
              .rename(columns={'count':'cat_nb','sum':'sum_charac'})

Output:

  item  cat_nb  sum_charac
0    A       3          45
1    B       4          70
2    C       1          10
3    D       1          10
4    E       3          60
5    F       2          45
6    G       2          40

how to do a vlookup in pandas with counts

Tags:

merge

pandas

count

pivot

vlookup

Matthieu Veron

1 Answers

Scott Boston

Recent Activity

Donate For Us

how to do a vlookup in pandas with counts

Tags:

merge

pandas

count

pivot

vlookup

Matthieu Veron

1 Answers

Scott Boston

Related questions

Recent Activity

Donate For Us