How to use the split function on every row in a dataframe in Python?

I want to count the number of times a word occurs in each review string.

I am reading the CSV file and storing it in a pandas DataFrame using the line below:

reviews = pd.read_csv("amazon_baby.csv")

The code below works when I apply it to a single review:

print reviews["review"][1]
a = reviews["review"][1].split("disappointed")
print a
b = len(a)
print b

The output of the above lines was:

it came early and was not disappointed. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.
['it came early and was not ', '. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.']
2

When I apply the same logic to the entire DataFrame using the line below, I receive an error message:

reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1

Error message:

Traceback (most recent call last):
  File "C:/Users/gouta/PycharmProjects/MLCourse1/Classifier.py", line 12, in <module>
    reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1
  File "C:\Users\gouta\Anaconda2\lib\site-packages\pandas\core\generic.py", line 2360, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'split'

asked Mar 19 '16 23:03

goutam

1 Answer

You're calling split on the entire review column of the data frame, which is a pandas Series (the object named in the error message), not a string. What you want to do is apply a function to each row of the data frame, which you can do by calling apply on the data frame with axis=1:

f = lambda x: len(x["review"].split("disappointed")) - 1
reviews["disappointed"] = reviews.apply(f, axis=1)
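As a self-contained sketch, here is the row-wise approach next to a vectorized alternative that is not part of the code above: the Series.str.count accessor. The sample data frame, the count_word helper, and the new column names are hypothetical and only stand in for amazon_baby.csv. I'm also assuming some reviews may be missing, in which case a plain split would fail on the non-string value, so both versions treat a missing review as zero occurrences.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for amazon_baby.csv.
reviews = pd.DataFrame({
    "review": [
        "it came early and was not disappointed. highly recommend it.",
        "disappointed at first, then even more disappointed.",
        None,  # a missing review
    ]
})

word = "disappointed"


def count_word(row):
    """Count occurrences of `word` in one row's review; 0 if it is missing."""
    text = row["review"]
    if not isinstance(text, str):
        return 0
    return len(text.split(word)) - 1


# Row-wise: apply the function to each row (axis=1).
reviews["disappointed_apply"] = reviews.apply(count_word, axis=1)

# Vectorized: Series.str.count treats its argument as a regular expression,
# so the word is escaped first. Missing reviews yield NaN, replaced here by 0.
reviews["disappointed_str"] = (
    reviews["review"].str.count(re.escape(word)).fillna(0).astype(int)
)

print(reviews[["disappointed_apply", "disappointed_str"]])
```

Both columns should hold the same counts. Note that either version counts substrings, so a word embedded in a longer word is counted too.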

answered Oct 05 '22 14:10

hoyland
