How can I melt a pandas data frame using multiple variable names and values? I have the following data frame that changes its shape in a for loop. In one of the for loop iterations, it looks like this:
ID  Cat    Class_A   Class_B     Prob_A     Prob_B
1   Veg      1        2          0.9         0.1
2   Veg      1        2          0.8         0.2
3   Meat     1        2          0.6         0.4
4   Meat     1        2          0.3         0.7
5   Veg      1        2          0.2         0.8
I need to melt it in such a way that it looks like this:
ID  Cat    Class     Prob    
1   Veg      1       0.9       
1   Veg      2       0.1
2   Veg      1       0.8        
2   Veg      2       0.2
3   Meat     1       0.6         
3   Meat     2       0.4
4   Meat     1       0.3         
4   Meat     2       0.7
5   Veg      1       0.2         
5   Veg      2       0.8
During the for loop the data frame will contain different number of classes with their probabilities. That is why I am looking for a general approach that is applicable in all my for loop iterations. I saw this question and this but they were not helpful!
You need lreshape by dict for specify categories:
d = {'Class':['Class_A', 'Class_B'], 'Prob':['Prob_A','Prob_B']}
df = pd.lreshape(df,d)
print (df)
    Cat  ID  Class  Prob
0   Veg   1      1   0.9
1   Veg   2      1   0.8
2  Meat   3      1   0.6
3  Meat   4      1   0.3
4   Veg   5      1   0.2
5   Veg   1      2   0.1
6   Veg   2      2   0.2
7  Meat   3      2   0.4
8  Meat   4      2   0.7
9   Veg   5      2   0.8
More dynamic solution:
Class = [col for col in df.columns if col.startswith('Class')]
Prob = [col for col in df.columns if col.startswith('Prob')]
df = pd.lreshape(df, {'Class':Class, 'Prob':Prob})
print (df)
    Cat  ID  Class  Prob
0   Veg   1      1   0.9
1   Veg   2      1   0.8
2  Meat   3      1   0.6
3  Meat   4      1   0.3
4   Veg   5      1   0.2
5   Veg   1      2   0.1
6   Veg   2      2   0.2
7  Meat   3      2   0.4
8  Meat   4      2   0.7
9   Veg   5      2   0.8
EDIT:
lreshape is now undocumented, but is possible in future will by removed (with pd.wide_to_long too). 
Possible solution is merging all 3 functions to one - maybe melt, but now it is not implementated. Maybe in some new version of pandas. Then my answer will be updated.
Or you can try this by using str.contain and pd.concat 
DF1=df2.loc[:,df2.columns.str.contains('_A|Cat|ID')]
name=['ID','Cat','Class','Prob']
DF1.columns=name
DF2=df2.loc[:,df2.columns.str.contains('_B|Cat|ID')]
DF2.columns=name
pd.concat([DF1,DF2],axis=0)
Out[354]: 
   ID   Cat  Class  Prob
0   1   Veg      1   0.9
1   2   Veg      1   0.8
2   3  Meat      1   0.6
3   4  Meat      1   0.3
4   5   Veg      1   0.2
0   1   Veg      2   0.1
1   2   Veg      2   0.2
2   3  Meat      2   0.4
3   4  Meat      2   0.7
4   5   Veg      2   0.8
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With