Remove punctuations in pandas [duplicate]

code: df['review'].head()

output:
        index         review
        0      These flannel wipes are OK, but in my opinion

I want to remove the punctuation from this column of the dataframe and store the result in a new column.

code: import string 
      def remove_punctuations(text):
          return text.translate(None,string.punctuation)

      df["new_column"] = df['review'].apply(remove_punctuations)

Error:
  return text.translate(None,string.punctuation)
  AttributeError: 'float' object has no attribute 'translate'

I am using Python 2.7. Any suggestions would be helpful.

data_person asked Sep 30 '16 01:09



2 Answers

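Using pandas str.replace with a regex:

df["new_column"] = df['review'].str.replace(r'[^\w\s]', '', regex=True)

Pass regex=True explicitly: since pandas 2.0, str.replace treats the pattern as a literal string by default. Unlike apply with a plain string method, the .str accessor leaves missing values (NaN, which is a float) as NaN, which is the likely source of the 'float' object error in the question.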
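
A minimal sketch of the regex approach on made-up data. The sample rows, including the NaN standing in for a missing review, are assumptions and not the asker's actual dataframe. Note that \w also matches the underscore, so underscores are kept by this pattern.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row mimics a missing review.
df = pd.DataFrame({
    "review": [
        "These flannel wipes are OK, but in my opinion",
        np.nan,
        "Great!!! (really)",
    ]
})

# [^\w\s] matches any character that is neither a word character nor whitespace.
# regex=True is required on pandas 2.0+, where the default is a literal match.
df["new_column"] = df["review"].str.replace(r"[^\w\s]", "", regex=True)

print(df["new_column"].tolist())
```

The missing row stays NaN in the new column instead of raising an error.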
Bob Haffner answered Oct 09 '22 17:10


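You can build a regex from the string module's punctuation list. Escape the list with re.escape first, because it contains characters such as \, ] and ^ that have a special meaning inside a character class (this needs import re and import string):

df["new_column"] = df['review'].str.replace('[{}]'.format(re.escape(string.punctuation)), '', regex=True)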
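
To see what a pattern built from string.punctuation matches, it can be tried on a plain string with the standard re module first. This is a sketch only; the helper name strip_punct and the sample text are made up for illustration. Without re.escape, the sequence \] in the punctuation list is read as an escaped bracket, so the backslash itself would not be removed.

```python
import re
import string

# re.escape backslash-escapes the characters that are special inside a
# character class (such as \, ] and ^), so each one is matched literally.
PUNCT_RE = re.compile("[{}]".format(re.escape(string.punctuation)))


def strip_punct(text):  # hypothetical helper name
    """Return text with every ASCII punctuation character removed."""
    return PUNCT_RE.sub("", text)


sample = r"OK, but [in my opinion] C:\temp!"
print(strip_punct(sample))  # OK but in my opinion Ctemp
```

Unlike [^\w\s], this pattern also removes underscores, and it only covers the ASCII punctuation listed in string.punctuation.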
David C answered Oct 09 '22 16:10
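
On Python 3, the translate-based function from the question needs a different call, because str.translate takes a single translation table there and no separate deletion argument. A minimal sketch, assuming Python 3 and assuming that non-string values such as NaN should pass through unchanged:

```python
import string

# Translation table that maps every ASCII punctuation character to None,
# which makes str.translate delete it.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuations(text):
    # Missing values arrive as float NaN and have no translate method,
    # so anything that is not a string is returned as-is.
    if not isinstance(text, str):
        return text
    return text.translate(PUNCT_TABLE)


print(remove_punctuations("These flannel wipes are OK, but in my opinion"))
```

With pandas it would be applied the same way as in the question: df["new_column"] = df['review'].apply(remove_punctuations).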