I have a dataset 'df' with 3 columns.
>> Original Data
    Student Id    Name  Marks
0       id_1    John    112
1       id_2    Rafs    181
2       id_2    Rafs    182
3       id_2    Rafs    183
4       id_3    Juan    222
5       id_3    Juan    312
6       id_3  Roller     21
Trying to keep the columns 'Student_Id' and 'Name' as it is but convert 'Marks' to multiple columns. Such that each unique 'Student_Id' and 'Name' will have a row of all possible Marks. Also we need not create columns manually but it should be dynamically created depending on the values.
>> Expected Output
    Student Id    Name  Marks1  Marks2  Marks3
0       id_1    John     112    <NA>    <NA>
1       id_2    Rafs     181     182     183
2       id_3    Juan     222     312    <NA>
3       id_3  Roller      21    <NA>    <NA>
Sample data to replicate the input
import pandas as pd
data = [
    ["id_1", 'John', 112],
    ["id_2", 'Rafs', 181],
    ["id_2", 'Rafs', 182],
    ["id_2", 'Rafs', 183], 
    ["id_3", 'Juan', 222],
    ["id_3", 'Juan', 312],
    ["id_3", 'Roller', 21]
]
df = pd.DataFrame(data, columns = ['Student Id', 'Name', 'Marks'])
I tried the below but I am not getting the desired output. It gives results in brackets() an also the Marks is missing.
df3 = df.pivot_table(index=['Student Id','Name'], columns='Marks', aggfunc = 'max')
>>Output
Empty DataFrame
Columns: []
Index: [(id_1, John), (id_2, Rafs), (id_3, Juan), (id_3, Roller)]
                Use GroupBy.cumcount for new column for counter columns created MultiIndex in df3:
df['g'] = df.groupby(['Student Id','Name']).cumcount().add(1)
df3 = (df.pivot_table(index=['Student Id','Name'], 
                     columns='g', 
                     values='Marks', 
                     aggfunc = 'max')
        .add_prefix('Marks')
        .rename_axis(None, axis=1)
        .reset_index())
print (df3)
  Student Id    Name  Marks1  Marks2  Marks3
0       id_1    John   112.0     NaN     NaN
1       id_2    Rafs   181.0   182.0   183.0
2       id_3    Juan   222.0   312.0     NaN
3       id_3  Roller    21.0     NaN     NaN
If need integers with missing values:
df['g'] = df.groupby(['Student Id','Name']).cumcount().add(1)
df3 = (df.pivot_table(index=['Student Id','Name'], 
                     columns='g', 
                     values='Marks', 
                     aggfunc = 'max')
        .add_prefix('Marks')
        .astype('Int64')
        .rename_axis(None, axis=1)
        .reset_index())
print (df3)
  Student Id    Name  Marks1  Marks2  Marks3
0       id_1    John     112    <NA>    <NA>
1       id_2    Rafs     181     182     183
2       id_3    Juan     222     312    <NA>
3       id_3  Roller      21    <NA>    <NA>
                        another way:
temp = df.groupby(["Student Id", "Name"]).Marks.agg(list)
out = (pd.DataFrame(temp.tolist(), index=temp.index)
           .rename(columns=lambda x: f"Marks{x+1}")
           .reset_index())
temp will be a dataframe with aggregated lists of Marks per (id, name). Then we form a dataframe out of it while renaming the columns to desired format and resetting the index to put id & name to columns back.
to get
  Student Id    Name  Marks1  Marks2  Marks3
0       id_1    John     112     NaN     NaN
1       id_2    Rafs     181   182.0   183.0
2       id_3    Juan     222   312.0     NaN
3       id_3  Roller      21     NaN     NaN
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With