
How to remove punctuation marks from a string in Python 3.x using .translate()?

I want to remove all punctuation marks from a text file using the .translate() method. It works well under Python 2.x, but under Python 3.4 it doesn't seem to do anything.

My code is as follows, and the output is the same as the input text.

    import string

    fhand = open("Hemingway.txt")
    for fline in fhand:
        fline = fline.rstrip()
        print(fline.translate(string.punctuation))
asked Dec 15 '15 16:12 by cybujan


2 Answers

You have to create a translation table using maketrans that you pass to the str.translate method.

In Python 3.1 and newer, maketrans is a static method on the str type, so you can use it to build a table that maps each punctuation character you want removed to None.

    import string

    # Thanks to Martijn Pieters for this improved version

    # This uses the 3-argument version of str.maketrans
    # with arguments (x, y, z) where 'x' and 'y'
    # must be equal-length strings and characters in 'x'
    # are replaced by characters in 'y'. 'z'
    # is a string (string.punctuation here)
    # where each character in the string is mapped
    # to None
    translator = str.maketrans('', '', string.punctuation)

    # This is an alternative that creates a dictionary mapping
    # of every character from string.punctuation to None (this will
    # also work)
    #translator = str.maketrans(dict.fromkeys(string.punctuation))

    s = 'string with "punctuation" inside of it! Does this work? I hope so.'

    # pass the translator to the string's translate method.
    print(s.translate(translator))

This should output:

string with punctuation inside of it Does this work I hope so 
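
To tie this back to the loop in the question, here is a minimal sketch that applies the same table line by line. The helper name strip_punctuation and the sample text are made up for illustration, and io.StringIO stands in for the open("Hemingway.txt") call so the snippet runs without the file.

```python
import io
import string

# Table that deletes every character found in string.punctuation.
translator = str.maketrans('', '', string.punctuation)


def strip_punctuation(lines):
    """Yield each line with trailing whitespace and punctuation removed."""
    for line in lines:
        yield line.rstrip().translate(translator)


# Stand-in for open("Hemingway.txt"); the text itself is invented.
fhand = io.StringIO("Hello, world!\nIt's a test-case; isn't it?\n")

cleaned = list(strip_punctuation(fhand))
for fline in cleaned:
    print(fline)
```

With a real file, passing the open file object to strip_punctuation should behave the same way, since a text file also iterates line by line. Building the table once outside the loop avoids recreating it for every line.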
answered Sep 23 '22 02:09 by 逆さま


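In Python 3 the call signature of str.translate has changed: it takes a single table argument, and the deletechars parameter has been removed. You could use

    import re
    import string

    # re.escape keeps ']', '\', '^' and '-' literal inside the character class
    fline = re.sub('[' + re.escape(string.punctuation) + ']', '', fline)

instead, or create a table as shown in the other answer.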
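
As a rough illustration of why the code in the question appears to do nothing: in Python 3, str.translate indexes its argument by code point, and a lookup that fails leaves the character unchanged. string.punctuation is a 32-character string, so only code points 0 to 31 fall inside it. The variable names below are chosen for this sketch only.

```python
import string

text = 'Does this work? I hope so.'

# The str is treated as a table indexed by code point. It has 32 entries,
# so every character from the space (code point 32) upward is outside the
# table and maps to itself.
unchanged = text.translate(string.punctuation)
print(unchanged == text)

# Code points 0..31 are inside the table: chr(0) maps to its first entry.
remapped = '\x00'.translate(string.punctuation)
print(remapped == string.punctuation[0])

# A dict keyed by code point with None values deletes characters instead.
table = {ord(c): None for c in string.punctuation}
deleted = text.translate(table)
print(deleted)
```

One side effect worth noting: control characters such as tabs do fall inside that 32-entry range, so the original call could silently replace them with punctuation rather than leave them alone.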

answered Sep 22 '22 02:09 by elzell