
Python Bag of Words NameError: name 'unicode' is not defined

I have been following this site, https://radimrehurek.com/data_science_python/, to apply bag of words on a list of tweets.

import csv
from textblob import TextBlob
import pandas

messages = pandas.read_csv('C:/Users/Suki/Project/Project12/newData1.csv', sep='\t', quoting=csv.QUOTE_NONE,
                               names=["label", "message"])

def split_into_tokens(message):
    message = unicode(message, encoding="utf8")  # convert bytes into proper unicode
    return TextBlob(message).words

messages.message.head().apply(split_into_tokens)

print (messages)

However, I keep getting this error. I've checked that I am following the code on the site, but the error keeps occurring.

Error

Traceback (most recent call last):
  File "C:/Users/Suki/Project/Project12/projectBagofWords.py", line 34, in <module>
    messages.message.head().apply(split_into_tokens)
  File "C:\Program Files\Python36\lib\site-packages\pandas\core\series.py", line 2510, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src\inference.pyx", line 1521, in pandas._libs.lib.map_infer
  File "C:/Users/Suki/Project/Project12/projectBagofWords.py", line 31, in split_into_tokens
    message = unicode(message, encoding="utf8")  # convert bytes into proper unicode
NameError: name 'unicode' is not defined

Can someone offer advice on how I could rectify this?

Thanks

Estra asked Jan 29 '23 02:01



1 Answer

unicode is a Python 2 built-in type, not a method. Python 3 removed it because str is already Unicode there. If you are not sure which version will run this code, you can add this at the beginning of your code so that the name unicode refers to str on Python 3:

import sys
if sys.version_info[0] >= 3:
    unicode = str
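
One caveat with the alias: on Python 3, pandas normally hands each message over as a str already, and calling str(message, encoding="utf8") on a str raises "TypeError: decoding str is not supported". A minimal sketch of a workaround, assuming Python 3, is to decode only when the value really is bytes. The helper name to_text is hypothetical and is not part of the tutorial or of pandas:

```python
def to_text(message):
    """Return message as text, decoding it only when it is raw bytes."""
    if isinstance(message, bytes):
        # Raw bytes (e.g. read from a file opened in binary mode): decode them.
        return message.decode("utf8")
    # Already a str, or something else such as a NaN float: convert without decoding.
    return str(message)
```

With this in place, split_into_tokens could return TextBlob(to_text(message)).words instead of calling unicode(...) directly.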
Mehdi answered Jan 31 '23 21:01
