
How to create a new dataframe from existing dataframes?

I have the following 2 dataframes:

df1

product_ID         tags
100         chocolate, sprinkles
101         chocolate, filled
102         glazed

df2

customer   product_ID
A            100
A            101
B            101
C            100
C            102
B            101
A            100
C            102

I would like to create a new dataframe like this:

| customer | chocolate | sprinkles | filled | glazed |
|----------|-----------|-----------|--------|--------|
| A        | ?         | ?         | ?      | ?      |
| B        | ?         | ?         | ?      | ?      |
| C        | ?         | ?         | ?      | ?      |

Each cell should hold the number of times that tag appears among the products the customer bought.

I've used merge and got the following result:

df3 = pd.merge(df2, df1)
df3 = df3.drop(['product_ID'], axis = 1)

customer       tags
A        chocolate, sprinkles
C        chocolate, sprinkles
A        chocolate, sprinkles
A        chocolate, filled
B        chocolate, filled
B        chocolate, filled
C        glazed
C        glazed

How do we get to the final result from here? Thanks in advance!

uharsha33 asked Jun 01 '26 05:06


1 Answer

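Using get_dummies on the merged dataframe (df3 in the question). The tags are separated by a comma and a space, so sep=', ' avoids column names with a leading space. sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent, and sort=False keeps customers in order of first appearance.

df3.set_index('customer').tags.str.get_dummies(sep=', ').groupby(level=0, sort=False).sum()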
Out[593]: 
          chocolate  filled  glazed  sprinkles
customer                                      
A                 3       1       0          2
C                 1       0       2          1
B                 2       2       0          0
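
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the two dataframes from the question and applies the same get_dummies idea. The variable name `counts` is my own, and the data is typed in by hand from the tables in the question, so treat it as an illustration, not the asker's actual setup.

```python
import pandas as pd

# Sample data copied by hand from the question.
df1 = pd.DataFrame({
    "product_ID": [100, 101, 102],
    "tags": ["chocolate, sprinkles", "chocolate, filled", "glazed"],
})
df2 = pd.DataFrame({
    "customer": ["A", "A", "B", "C", "C", "B", "A", "C"],
    "product_ID": [100, 101, 101, 100, 102, 101, 100, 102],
})

# Attach the tags to every purchase, then keep only customer and tags.
df3 = df2.merge(df1, on="product_ID").drop(columns="product_ID")

# One 0/1 column per tag, then add up the rows that belong to each customer.
# groupby sorts the customers by default, so the rows come out as A, B, C.
counts = (
    df3.set_index("customer")["tags"]
       .str.get_dummies(sep=", ")
       .groupby(level=0)
       .sum()
)
print(counts)
```

If the customer should be an ordinary column, as in the table sketched in the question, `counts.reset_index()` should do it.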
BENY answered Jun 04 '26 08:06


