Melt multiple columns pandas dataframe based on criteria

I have a pandas dataframe as follows:

import pandas as pd

dataframe = pd.DataFrame(
    {
    'ID': [1,2,3,4],
    'Gender': ['F','F','M','M'],
    'Language': ['EN', 'ES', 'EN', 'EN'],
    'Year 1': [2020,2020,2020,2020],
    'Score 1': [93,97,83,86],
    'Year 2': [2020,2020,None,2018],
    'Score 2': [85,95,None,55],
    'Year 3': [2020,2018,None,None],
    'Score 3': [87,86,None,None]
    }
)

ID  Gender  Language  Year 1  Score 1  Year 2  Score 2  Year 3  Score 3
1   F       EN        2020    93       2020    85       2020    87
2   F       ES        2020    97       2020    95       2018    86
3   M       EN        2020    83       NaN     NaN      NaN     NaN
4   M       EN        2020    86       2018    55       NaN     NaN

I would like to melt the Year/Score column pairs into rows and keep only the rows for a given year. For example, if the year is 2020, the following would be generated:

ID  Gender  Language  Year  Score
1   F       EN        2020  93
1   F       EN        2020  85
1   F       EN        2020  87
2   F       ES        2020  97
2   F       ES        2020  95
3   M       EN        2020  83
4   M       EN        2020  86

I have tried using pd.melt, but I am having trouble filtering by year across the columns while keeping the corresponding scores.

josh453 asked Sep 15 '25 15:09



1 Answer

From what I understand, you can reshape with pd.wide_to_long and then drop the rows that have no data:

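# Turn each 'Year N'/'Score N' pair into a row, drop empty pairs and the suffix level 'v'
out = (pd.wide_to_long(dataframe, stubnames=['Year', 'Score'],
                       i=['ID', 'Gender', 'Language'], j='v', sep=' ')
         .dropna().droplevel(-1).reset_index())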

print(out)

   ID Gender Language    Year  Score
0   1      F       EN  2020.0   93.0
1   1      F       EN  2020.0   85.0
2   1      F       EN  2020.0   87.0
3   2      F       ES  2020.0   97.0
4   2      F       ES  2020.0   95.0
5   2      F       ES  2018.0   86.0
6   3      M       EN  2020.0   83.0
7   4      M       EN  2020.0   86.0
8   4      M       EN  2018.0   55.0
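
The output above still contains the 2018 rows, so one more step is needed to match the table in the question. Here is a minimal sketch of that filter, assuming the year of interest is held in a variable. The names `target_year`, `long_df` and `filtered` are introduced here for illustration and are not part of the original answer. The cast back to integers is optional and only safe because `dropna()` has already removed the missing values.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'Gender': ['F', 'F', 'M', 'M'],
    'Language': ['EN', 'ES', 'EN', 'EN'],
    'Year 1': [2020, 2020, 2020, 2020],
    'Score 1': [93, 97, 83, 86],
    'Year 2': [2020, 2020, None, 2018],
    'Score 2': [85, 95, None, 55],
    'Year 3': [2020, 2018, None, None],
    'Score 3': [87, 86, None, None],
})

# Same reshape as in the answer: one row per (ID, Year N, Score N) pair.
long_df = (
    pd.wide_to_long(dataframe, stubnames=['Year', 'Score'],
                    i=['ID', 'Gender', 'Language'], j='v', sep=' ')
    .dropna()
    .droplevel(-1)
    .reset_index()
)

target_year = 2020  # hypothetical parameter: the year to keep

# Keep only rows for the target year, then restore integer dtypes.
filtered = (
    long_df[long_df['Year'] == target_year]
    .astype({'Year': int, 'Score': int})
    .reset_index(drop=True)
)

print(filtered)
```

Filtering after the reshape keeps each score attached to its own year, which is the part that is awkward to do while the data is still in wide form.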
anky answered Sep 18 '25 05:09
