I have a dataframe that looks like this:
userId movieId rating
0 1 31 2.5
1 1 1029 3.0
2 1 3671 3.0
3 2 10 4.0
4 2 17 5.0
5 3 60 3.0
6 3 110 4.0
7 3 247 3.5
8 4 10 4.0
9 4 112 5.0
10 5 3 4.0
11 5 39 4.0
12 5 104 4.0
I need to get a dataframe which has unique userId, number of ratings by the user and the average rating by the user as shown below:
userId count mean
0 1 3 2.83
1 2 2 4.5
2 3 3 3.5
3 4 2 4.5
4 5 3 4.0
Can someone help?
Apply a function groupby to each row or column of a DataFrame. Groupby one column and return the mean of the remaining columns in each group. Groupby two columns and return the mean of the remaining column. Groupby one column and return the mean of only particular column in the group.
The most simple method for pandas groupby count is by using the in-built pandas method named size(). It returns a pandas series that possess the total number of row count for each group. The basic working of the size() method is the same as len() method and hence, it is not affected by NaN values in the dataset.
To get column average or mean from pandas DataFrame use either mean() and describe() method. The DataFrame. mean() method is used to return the mean of the values for the requested axis.
What is the GroupBy function? Pandas' GroupBy is a powerful and versatile function in Python. It allows you to split your data into separate groups to perform computations for better analysis.
df1 = df.groupby('userId')['rating'].agg(['count','mean']).reset_index()
print(df1)
userId count mean
0 1 3 2.833333
1 2 2 4.500000
2 3 3 3.500000
3 4 2 4.500000
4 5 3 4.000000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With