How can I use value_counts() only for certain values?

Tags: python, pandas, data-manipulation

I want to find out how many positive reviews each brand has in a dataset that includes reviews of thousands of products. I used the code below and got a table with the percentage of both positive and non-positive reviews. How can I get only the percentage of positive reviews by brand? I only want the "True" results in positive_review. Thanks!

df_reviews_ok.groupby("brand")["positive_review"].value_counts(normalize=True).mul(100).round(2)
brand                   positive_review
Belkin                  False               70.00
                        True                30.00
Bowers & Wilkins        False               67.65
                        True                32.35
Corsair                 False               75.22
                        True                24.78
Definitive Technology   False               68.29
                        True                31.71
Dell                    False               60.87
                        True                39.13
DreamWave               False              100.00
House of Marley         False              100.00
JBL                     False               58.43
                        True                41.57
Kicker                  True                66.67
                        False               33.33
Lenovo                  False               76.92
                        True                23.08
Logitech                False               75.75
                        True                24.25
MEE audio               False               53.80
                        True                46.20
Microsoft               False               67.86
                        True                32.14
Midland                 False               72.09
                        True                27.91
Motorola                False               72.92
                        True                27.08
Netgear                 False               72.30
                        True                27.70
Pny                     False               68.78
                        True                31.22
Power Acoustik          False              100.00
SVS                     False              100.00
Samsung                 False               61.94
                        True                38.06
Sanus                   False               75.93
                        True                24.07
Sdi Technologies, Inc.  False               55.63
                        True                44.37
Siriusxm                False               73.33
                        True                26.67
Sling Media             False               67.16
                        True                32.84
Sony                    False               55.40
                        True                44.60
Toshiba                 False               56.52
                        True                43.48
Ultimate Ears           False               70.21
                        True                29.79
Verizon Wireless        False               75.86
                        True                24.14
WD                      False               58.33
                        True                41.67
Yamaha                  False               61.15
                        True                38.85
Name: positive_review, dtype: float64

752

asked Nov 22 '21 17:11

Mario Poveda

2 Answers

Using the following toy DataFrame as an example:

import pandas as pd

df = pd.DataFrame({
    'brand': list('AAAABBBB'),
    'positive': [True, True, False, False, True, True, True, False]
})

If you want the share of positive reviews for each brand relative to that brand's total number of reviews, take the mean of the boolean column (True counts as 1 and False as 0):

df.groupby('brand')['positive'].mean()

The result is the fraction of positive reviews per brand (multiply by 100 to get a percentage):

brand
A    0.50
B    0.75
Name: positive, dtype: float64
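To match the percentage format used in the question, the same idea can be scaled and rounded. This is only a sketch on the toy data: the extra brand 'C' and the name pct_positive are my own additions, included to show that a brand with no positive reviews stays in the result with 0 instead of disappearing.

```python
import pandas as pd

# Toy data; brand 'C' is a hypothetical brand with no positive reviews.
df = pd.DataFrame({
    'brand': list('AAAABBBBCC'),
    'positive': [True, True, False, False,
                 True, True, True, False,
                 False, False],
})

# The mean of a boolean column is the fraction of True values,
# so scaling by 100 gives the percentage of positive reviews per brand.
pct_positive = df.groupby('brand')['positive'].mean().mul(100).round(2)
print(pct_positive)
```

Because every brand has a mean, no separate filtering step for the True rows is needed with this approach.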

179

answered Oct 19 '22 20:10

rudolfovic

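You can unstack the output and select the True column. Passing fill_value=0 keeps brands that have no positive reviews in the result, with 0 instead of NaN: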

(df.groupby('brand')
   ['positive_review'].value_counts(normalize=True)
   .mul(100).round(2)
   .unstack(fill_value=0)
   [True]
 )
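Here is a small self-contained run of the same chain, as a sketch only. The toy data, the brand 'C' (which has no True rows) and the names pct, wide and only_true are assumptions for illustration, not part of the original dataset.

```python
import pandas as pd

# Toy data; brand 'C' has only non-positive reviews.
df = pd.DataFrame({
    'brand': list('AAAABBBBCC'),
    'positive_review': [True, True, False, False,
                        True, True, True, False,
                        False, False],
})

# Percentages per (brand, positive_review) pair, as in the question.
pct = (
    df.groupby('brand')['positive_review']
      .value_counts(normalize=True)
      .mul(100).round(2)
)

# Move the False/True level into columns; pairs that never occur
# (here: brand 'C' with True) are filled with 0.
wide = pct.unstack(fill_value=0)

# Keep only the column for positive reviews.
only_true = wide[True]
print(only_true)
```

Without fill_value=0, the missing (brand, True) combinations would come out as NaN after unstacking.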

answered Oct 19 '22 21:10

mozway

