
How to use the split function on every row in a dataframe in Python?


I want to count the number of times a word is repeated in each review string.

I read the CSV file into a pandas DataFrame with the line below:

reviews = pd.read_csv("amazon_baby.csv") 

The code below works when I apply it to a single review:

print reviews["review"][1]
a = reviews["review"][1].split("disappointed")
print a
b = len(a)
print b

The output of the above lines was:

it came early and was not disappointed. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.
['it came early and was not ', '. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.']
2

When I apply the same logic to the entire DataFrame using the line below, I receive an error message:

reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1 

Error message:

Traceback (most recent call last):
  File "C:/Users/gouta/PycharmProjects/MLCourse1/Classifier.py", line 12, in <module>
    reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1
  File "C:\Users\gouta\Anaconda2\lib\site-packages\pandas\core\generic.py", line 2360, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'split'
goutam asked Mar 19 '16 23:03



1 Answer

You're calling split on the entire review column of the data frame, which is a pandas Series (the Series mentioned in the error message), not a single string. What you want to do is apply a function to each row of the data frame, which you can do by calling apply on the data frame with axis=1:

f = lambda x: len(x["review"].split("disappointed")) - 1
reviews["disappointed"] = reviews.apply(f, axis=1)
hoyland answered Oct 05 '22 14:10
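
For anyone who wants to try this without the original CSV, here is a minimal sketch of the row-wise apply approach next to a vectorized alternative that uses the Series .str accessor. The small DataFrame is a made-up stand-in for amazon_baby.csv (only a "review" column is assumed), and the helper name count_word and the column names disappointed_apply and disappointed_count are hypothetical. The isinstance guard is an addition, on the assumption that the real data may contain missing reviews, which would otherwise make split fail.

```python
import pandas as pd

# Hypothetical stand-in for amazon_baby.csv: only a "review" column is assumed.
reviews = pd.DataFrame({
    "review": [
        "it came early and was not disappointed. highly recommend it.",
        "disappointed at first, then even more disappointed.",
        "works great",
        None,  # a missing review, to show how each approach copes
    ]
})


def count_word(row, word="disappointed"):
    """Row-wise count via split, returning 0 for non-string (missing) reviews."""
    text = row["review"]
    return len(text.split(word)) - 1 if isinstance(text, str) else 0


# Row-wise approach, as in the answer: apply the function to each row.
reviews["disappointed_apply"] = reviews.apply(count_word, axis=1)

# Vectorized alternative: .str applies the string method to every element.
# str.count treats its argument as a regular expression; "disappointed"
# contains no special characters, so it matches literally here.
reviews["disappointed_count"] = (
    reviews["review"].str.count("disappointed").fillna(0).astype(int)
)

print(reviews[["disappointed_apply", "disappointed_count"]])
```

Both columns should agree on this sample. For a search word that contains regex metacharacters (such as "." or "+"), it would need escaping with re.escape before being passed to str.count.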
