I have a dataframe which looks like this.
id   YearReleased  Artist          count
168  2015          Muse            1
169  2015          Rihanna         3
170  2015          Taylor Swift    2
171  2016          Jennifer Lopez  1
172  2016          Rihanna         3
173  2016          Underworld      1
174  2017          Coldplay        1
175  2017          Ed Sheeran      2
I want to get the maximum count for each year and then get the corresponding Artist name.
Something like this:
YearReleased  Artist
2015          Rihanna
2016          Rihanna
2017          Ed Sheeran
I have tried using a loop to iterate over the rows of the dataframe and create another dictionary with key as year and value as artist. But when I try to convert that dictionary to a dataframe, the keys are mapped to columns instead of rows.
Can somebody suggest a better approach that avoids looping over the dataframe and uses a built-in pandas method instead?
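On the dictionary problem described above: one way to get one row per key is to pass the dictionary's (key, value) pairs to the DataFrame constructor as rows. This is only a minimal sketch; the dictionary name and its contents are hypothetical stand-ins for whatever the loop produced, and the column names are taken from the desired output.

```python
import pandas as pd

# Hypothetical dictionary such as the loop might build: year -> artist.
top_artist_by_year = {2015: "Rihanna", 2016: "Rihanna", 2017: "Ed Sheeran"}

# Each (key, value) pair becomes one row, so years end up in rows, not columns.
result = pd.DataFrame(
    list(top_artist_by_year.items()),
    columns=["YearReleased", "Artist"],
)
print(result)
```

This still relies on the loop to build the dictionary, so it only addresses the conversion step, not the request for a loop-free approach.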
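Look at idxmax. Grouping by YearReleased and calling idxmax on count gives the index label of the row with the highest count in each year, and loc then selects those rows: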
df.loc[df.groupby('YearReleased')['count'].idxmax()]
Out[445]:
    id  YearReleased      Artist  count
1  169          2015     Rihanna      3
4  172          2016     Rihanna      3
7  175          2017  Ed Sheeran      2
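For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample data from the question and trims the result to the two columns requested. The variable names top_rows and result are my own, and the reset_index call is optional tidying, not part of the one-liner above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "id": [168, 169, 170, 171, 172, 173, 174, 175],
        "YearReleased": [2015, 2015, 2015, 2016, 2016, 2016, 2017, 2017],
        "Artist": [
            "Muse", "Rihanna", "Taylor Swift", "Jennifer Lopez",
            "Rihanna", "Underworld", "Coldplay", "Ed Sheeran",
        ],
        "count": [1, 3, 2, 1, 3, 1, 1, 2],
    }
)

# Index label of the row holding the maximum count within each year.
top_rows = df.groupby("YearReleased")["count"].idxmax()

# Select those rows and keep only the columns asked for in the question.
result = df.loc[top_rows, ["YearReleased", "Artist"]].reset_index(drop=True)
print(result)
```

Note that idxmax returns the first occurrence when several rows in a group share the maximum, so a tie within a year yields only one artist for that year.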