I have a dataframe like below.
user cat  val
u1   cat1 1
u1   cat2 2
u1   cat3 3
u2   cat1 5
u3   cat4 4
And wish to transform it like this.
user cat  val(avg)
u1   cat1 1
u1   cat2 2
u1   cat3 3
u1   cat4 4
u2   cat1 5
u2   cat2 2
u2   cat3 3
u2   cat4 4
u3   cat1 3
u3   cat2 2
u3   cat3 3
u3   cat4 4
There are also a couple of numeric columns to the right of cat, which hopefully can be filled with either NA; or if possible averaged out.
one way to solve this,
l1=df['user'].unique().tolist()
l2=df['cat'].unique().tolist()
new_df =  pd.DataFrame(list(itertools.product(l1,l2))).rename(columns={0:'user',1:'cat'})
new_df=pd.merge(new_df,df,on=['user','cat'],how='left')
Output:
   user   cat  val
0    u1  cat1  1.0
1    u1  cat2  2.0
2    u1  cat3  3.0
3    u1  cat4  NaN
4    u2  cat1  5.0
5    u2  cat2  NaN
6    u2  cat3  NaN
7    u2  cat4  NaN
8    u3  cat1  NaN
9    u3  cat2  NaN
10   u3  cat3  NaN
11   u3  cat4  4.0
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With