I want to find the share of positive reviews per brand in a dataset that includes reviews of thousands of products. I used the code below and got a table with the percentage of both positive and non-positive reviews. How can I get only the percentage of positive reviews by brand? I only want the "True" results in positive_review. Thanks!
df_reviews_ok.groupby("brand")["positive_review"].value_counts(normalize=True).mul(100).round(2)
brand positive_review
Belkin False 70.00
True 30.00
Bowers & Wilkins False 67.65
True 32.35
Corsair False 75.22
True 24.78
Definitive Technology False 68.29
True 31.71
Dell False 60.87
True 39.13
DreamWave False 100.00
House of Marley False 100.00
JBL False 58.43
True 41.57
Kicker True 66.67
False 33.33
Lenovo False 76.92
True 23.08
Logitech False 75.75
True 24.25
MEE audio False 53.80
True 46.20
Microsoft False 67.86
True 32.14
Midland False 72.09
True 27.91
Motorola False 72.92
True 27.08
Netgear False 72.30
True 27.70
Pny False 68.78
True 31.22
Power Acoustik False 100.00
SVS False 100.00
Samsung False 61.94
True 38.06
Sanus False 75.93
True 24.07
Sdi Technologies, Inc. False 55.63
True 44.37
Siriusxm False 73.33
True 26.67
Sling Media False 67.16
True 32.84
Sony False 55.40
True 44.60
Toshiba False 56.52
True 43.48
Ultimate Ears False 70.21
True 29.79
Verizon Wireless False 75.86
True 24.14
WD False 58.33
True 41.67
Yamaha False 61.15
True 38.85
Name: positive_review, dtype: float64
Use the sum function to count specific values in a DataFrame column. Calling sum() on a boolean condition applied to a column counts the rows that satisfy it; here == selects only the rows equal to a specific value. Any other boolean expression can be counted the same way.
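As a minimal sketch of that idea, assuming a small made-up frame (the name `df_toy` and its values are illustrative and not taken from the question's data), summing a boolean mask counts the matching rows, and the same trick works per group:

```python
import pandas as pd

# Hypothetical toy data; column names mirror the question's, values are made up.
df_toy = pd.DataFrame({
    "brand": ["A", "A", "A", "B", "B"],
    "positive_review": [True, False, True, False, False],
})

# A comparison yields a boolean Series; True counts as 1 when summed.
n_brand_a = (df_toy["brand"] == "A").sum()

# Summing the boolean column within each brand counts its positive reviews.
positive_per_brand = df_toy.groupby("brand")["positive_review"].sum()
```

Note that this gives counts, not percentages; dividing by the group size (or taking the mean) turns them into shares.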
value_counts() returns a Series containing counts of unique values. The resulting object is in descending order, so the first element is the most frequently occurring one.
count() is a built-in method of Python lists and strings. It returns how many times a given element occurs in a list, or a given substring occurs in a string. The element to count is passed as the argument, and the return value is an integer.
Using the following toy DataFrame as an example:
import pandas as pd

df = pd.DataFrame({
'brand': list('AAAABBBB'),
'positive': [True, True, False, False, True, True, True, False]
})
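If you would like the share of positive reviews for each brand, relative to that brand's total number of reviews, take the mean of the boolean column (True counts as 1 and False as 0). The result is a fraction between 0 and 1; multiply it by 100 to get a percentage: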
df.groupby('brand')['positive'].mean()
The result is as expected:
brand
A 0.50
B 0.75
Name: positive, dtype: float64
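To express the same result as a percentage rounded to two decimals, as in the question, one possible sketch chains `mul(100)` and `round(2)` onto the mean. It reuses the toy data above; the variable name `pct_positive` is just illustrative.

```python
import pandas as pd

# Same toy DataFrame as in the answer above.
df = pd.DataFrame({
    "brand": list("AAAABBBB"),
    "positive": [True, True, False, False, True, True, True, False],
})

# Mean of a boolean column is the fraction of True values per brand;
# scale to a percentage and round to two decimals.
pct_positive = df.groupby("brand")["positive"].mean().mul(100).round(2)
```

With this approach a brand that has no positive reviews shows up as 0.0 instead of dropping out of the result.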
You can unstack the output and select the True column:
(df.groupby('brand')
['positive_review'].value_counts(normalize=True)
.mul(100).round(2)
.unstack(fill_value=0)
[True]
)
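Here is a self-contained sketch of this approach on made-up data (the `df_toy` frame and its values are assumptions for illustration). Brand "C" has no positive reviews, which shows why `fill_value=0` matters: without it, the missing True entry would come out as NaN.

```python
import pandas as pd

# Hypothetical toy data; brand "C" has only a non-positive review.
df_toy = pd.DataFrame({
    "brand": ["A", "A", "B", "B", "C"],
    "positive_review": [True, False, True, True, False],
})

# Percentage of each outcome per brand, pivoted so that False/True become columns.
pct = (
    df_toy.groupby("brand")["positive_review"]
    .value_counts(normalize=True)
    .mul(100)
    .round(2)
    .unstack(fill_value=0)
)

# Keep only the column for positive reviews.
positive_pct = pct[True]
```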