
str.translate gives TypeError - Translate takes one argument (2 given), worked in Python 2

I have the following code

import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile

lmtzr = nltk.stem.wordnet.WordNetLemmatizer()

def sanitize(wordList):
    answer = [word.translate(None, string.punctuation) for word in wordList]
    answer = [lmtzr.lemmatize(word.lower()) for word in answer]
    return answer

words = []
for filename in json_list:
    words.extend([sanitize(nltk.word_tokenize(' '.join([tweet['text'] 
                   for tweet in json.load(open(filename,READ))])))])

I tested lines 2-4 in a separate testing.py file, where I wrote

import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile

wordList= ['\'the', 'the', '"the']
print wordList
wordList2 = [word.translate(None, string.punctuation) for word in wordList]
print wordList2
answer = [lmtzr.lemmatize(word.lower()) for word in wordList2]
print answer

freq = nltk.FreqDist(wordList2)
print freq

and the command prompt returns ['the','the','the'], which is what I wanted (removing punctuation).

However, when I put the exact same code in a different file, Python raises a TypeError:

File "foo.py", line 8, in <module>
  for tweet in json.load(open(filename, READ))])))])
File "foo.py", line 2, in sanitize
  answer = [word.translate(None, string.punctuation) for word in wordList]
TypeError: translate() takes exactly one argument (2 given)

json_list is a list of all the file paths (I printed it and checked that the list is valid). I'm confused by this TypeError because everything works perfectly fine when I test the same code in a different file.

carebear asked Apr 19 '14 21:04


2 Answers

If all you want is to do in Python 3 what you were already doing in Python 2, here is what I used in Python 2 to throw away punctuation and digits:

text = text.translate(None, string.punctuation)
text = text.translate(None, '1234567890')

Here is my Python 3 equivalent:

text = text.translate(str.maketrans('','',string.punctuation))
text = text.translate(str.maketrans('','','1234567890'))

Basically it says "translate nothing to nothing" (the first two arguments are empty, so no character is remapped) and "translate any punctuation or digits to None" (the third argument, i.e. remove them).
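As a minimal sketch of the Python 3 approach above, the two calls can be folded into one translation table that is built once and reused. The function name strip_punct_and_digits is hypothetical, chosen only for illustration, and the sample input is the word list from the question.

```python
import string

# Build the table once. The first two arguments are empty, so nothing is
# remapped; every character in the third argument is mapped to None, which
# translate() treats as "delete this character".
_DELETE_TABLE = str.maketrans('', '', string.punctuation + string.digits)

def strip_punct_and_digits(text):
    """Remove ASCII punctuation and digits from text (Python 3 only)."""
    return text.translate(_DELETE_TABLE)

# Hypothetical usage with the word list from the question.
word_list = ["'the", "the", '"the']
cleaned = [strip_punct_and_digits(word) for word in word_list]
print(cleaned)
```

Building the table outside the function avoids rebuilding it for every word, which matters when the function runs over every token of a large corpus.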

drchuck answered Oct 16 '22 14:10


I suspect your issue has to do with the differences between str.translate and unicode.translate (these are also the differences between str.translate on Python 2 versus Python 3). Your original code is probably being passed unicode instances (json.load returns unicode strings on Python 2), while your test code uses regular 8-bit str instances.

I don't suggest converting Unicode strings back to regular str instances, since unicode is a much better type for handling text data (and it is the future!). Instead, you should just adapt to the new unicode.translate syntax. With regular str.translate (on Python 2), you can pass an optional deletechars argument and the characters in it would be removed from the string. For unicode.translate (and str.translate on Python 3), the extra argument is no longer allowed, but translation table entries with None as their value will be deleted from the output.
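The failure described above can be reproduced on Python 3, where every str is a Unicode string. This is only an illustrative sketch; the helper name translate_py2_style is hypothetical, and the exact wording of the error message varies between Python versions.

```python
import string

def translate_py2_style(word):
    """Call translate() with the Python 2 str signature and report what happens."""
    try:
        # Python 2 byte strings accept (table, deletechars); Unicode strings
        # accept only a single table argument, so this call raises TypeError.
        return word.translate(None, string.punctuation)
    except TypeError as exc:
        return "TypeError: %s" % exc

print(translate_py2_style("'the"))
```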

To solve the problem you'll need to create an appropriate translation table. For Unicode strings, a translation table is a dictionary mapping from Unicode ordinals (that is, ints) to ordinals, strings or None. A helper function for making tables exists in Python 2 as string.maketrans (and in Python 3 as a static method of the str type), but the Python 2 version of it doesn't handle the case we care about (putting None values into the table). You can build an appropriate dictionary yourself with something like {ord(c): None for c in string.punctuation} (dict comprehensions need Python 2.7 or later).
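A minimal sketch of that dictionary-based table applied to the question's sanitize function follows. It assumes the words are Unicode strings (unicode on Python 2, str on Python 3). The name sanitize_words is hypothetical, and the lemmatizer step from the question is left out so the example has no third-party dependencies.

```python
import string

# Map the code point of every punctuation character to None, so that
# translate() deletes those characters from a Unicode string.
DELETE_PUNCT = {ord(c): None for c in string.punctuation}

def sanitize_words(word_list):
    """Strip ASCII punctuation from each word and lowercase the result."""
    return [word.translate(DELETE_PUNCT).lower() for word in word_list]

# Hypothetical usage with the word list from the question.
print(sanitize_words(["'the", "The", '"the']))
```

In the original function, the lemmatize call could be applied to each word after the translate step, as in the question's second list comprehension.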

Blckknght answered Oct 16 '22 16:10