Pandas: add a new column counting how often each person reached the highest score of a day

I have a pandas DataFrame that contains everyone's scores for every day. I want to add one extra column that counts how many times each person had the highest score of the day (more than one person can share the top score, and some values are NaN).

import pandas as pd
import numpy as np

data = np.array([['','day1','day2','day3','day4','day5'],
                ['larry',1,4,7,3,5],
                ['niko',2,-1,3,6,4],
                ['tin',np.nan,5,5, 6,7]])
                
df = pd.DataFrame(data=data[1:,1:],
                  index=data[1:,0],
                  columns=data[0,1:])
print(df)

Output:

      day1 day2 day3 day4 day5
larry    1    4    7    3    5
niko     2   -1    3    6    4
tin    nan    5    5    6    7

The expected result is (larry: 1 time, niko: 2 times, tin: 3 times):

      times_of_top day1 day2 day3 day4 day5
larry            1    1    4    7    3    5
niko             2    2   -1    3    6    4
tin              3  nan    5    5    6    7

larry has the highest score only on day3, so his times_of_top is 1.
niko has the highest score on day1 and day4, so his times_of_top is 2.
tin has the highest score on day2, day4 and day5, so his times_of_top is 3.

Larry Cai asked Feb 06 '21 10:02


1 Answer

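One way is to use pandas.DataFrame.stack and then count the entries per person:

# The sample data are stored as strings (object dtype), so convert them first.
df = df.astype(float)

# Keep only the cells equal to their day's (column's) maximum, stack them into
# a Series indexed by (person, day), then count the non-missing entries per person.
# Series.count(level=0) was removed in pandas 2.0; groupby(level=0).count() replaces it.
df["times_of_top"] = df[df == df.max()].stack().groupby(level=0).count()
print(df)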

Output:

       day1  day2  day3  day4  day5  times_of_top
larry   1.0   4.0   7.0   3.0   5.0             1
niko    2.0  -1.0   3.0   6.0   4.0             2
tin     NaN   5.0   5.0   6.0   7.0             3
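
A more direct variant, offered only as a sketch and not as the method above: compare every cell with its day's maximum using DataFrame.eq and sum the resulting boolean mask across each row. The sample frame is rebuilt here with numeric values so the snippet runs on its own; the names is_top and times_of_top are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample scores as in the question, built directly as numbers.
df = pd.DataFrame(
    {
        "day1": [1, 2, np.nan],
        "day2": [4, -1, 5],
        "day3": [7, 3, 5],
        "day4": [3, 6, 6],
        "day5": [5, 4, 7],
    },
    index=["larry", "niko", "tin"],
)

# df.max() is the per-day (column-wise) maximum and skips NaN by default.
# df.eq(...) marks each cell equal to its day's maximum; NaN never compares equal.
is_top = df.eq(df.max())

# Summing the boolean mask across each row counts the top days per person.
times_of_top = is_top.sum(axis=1)

# Put the new column first, as in the expected result of the question.
df.insert(0, "times_of_top", times_of_top)
print(df)
```

With this approach, ties count for everyone who shares the top score, and a person who never tops a day gets 0, since the sum of an all-False row is 0.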
Chris answered Oct 18 '22 14:10