
How to solve "ValueError: binary mode doesn't take an encoding argument"

Tags:

python

nltk

Code:

import nltk
eng_lish= open("C:/Users/Nouros/Desktop/Thesis/english.csv","rb", encoding='utf8').read()
bang_lish= open("C:/Users/Nouros/Desktop/Thesis/banglish.csv","rb", encoding='utf8').read()

Problem:

Traceback (most recent call last):
  File "C:/Users/Nouros/Desktop/Thesis/nltk_run_copy.py", line 3, in <module>
    eng_lish= open("C:/Users/Nouros/Desktop/Thesis/english.csv","rb",encoding="utf-8")
ValueError: binary mode doesn't take an encoding argument
Nouros asked Feb 16 '18 16:02



2 Answers

You're reading CSV files, which are text files, so you need an encoding but not binary mode.

That means you should not open them with rb. (Binary mode is advised when using the csv module in Python 2, but that does not apply in other contexts.)

Just use plain text mode:

open("C:/Users/Nouros/Desktop/Thesis/english.csv","r", encoding='utf8').read()

Personally, I would use the csv module to avoid splitting lines and columns manually:

import csv
with open(r"C:\Users\Nouros\Desktop\Thesis\english.csv", "r", encoding='utf8') as f:
    cr = csv.reader(f, delimiter=",")  # "," is the default delimiter
    rows = list(cr)  # create a list of rows, for instance

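(Note that the csv module documentation recommends passing newline="" when opening files in Python 3. Without it, newlines embedded in quoted fields may be misread, though the more commonly seen problem, an extra \r on Windows, occurs when writing files.)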
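As a minimal sketch of the csv approach with newline="" applied, the following writes a small temporary file and reads it back in text mode. The sample rows and the temporary file are hypothetical stand-ins for english.csv, not the asker's data.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for english.csv.
sample_rows = [["id", "text"], ["1", "hello, world"], ["2", "naïve"]]

# Write the sample as UTF-8 text; newline="" lets the csv module
# control line endings itself.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".csv", delete=False
) as tmp:
    csv.writer(tmp).writerows(sample_rows)
    path = tmp.name

# Read it back in text mode ("r", not "rb") with an explicit encoding.
with open(path, "r", encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))

os.remove(path)  # clean up the temporary file
```

Each element of rows is a list of strings, one per column, so no manual splitting on commas is needed.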

Jean-François Fabre answered Oct 14 '22 03:10



Binary mode by definition does not take an encoding, because you are reading raw bytes. An encoding is only relevant when you want to read text: it defines how bytes map to characters. Different encodings interpret the same bytes differently. In some encodings a single byte always represents one character; in others, such as UTF-8, a character may take several bytes. That is the whole purpose of an encoding: to represent text characters as bytes.
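To make the bytes-versus-text distinction concrete, here is a small sketch using a temporary file. The file contents and variable names are made up for illustration. It reads the same file in binary mode and in text mode, then reproduces the error from the question.

```python
import os
import tempfile

# Write a small UTF-8 file containing one non-ASCII character.
text = "café"
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".csv", delete=False
) as tmp:
    tmp.write(text)
    path = tmp.name

# Binary mode returns bytes; "é" occupies two bytes in UTF-8.
with open(path, "rb") as f:
    raw = f.read()

# Text mode decodes those bytes into characters.
with open(path, "r", encoding="utf-8") as f:
    decoded = f.read()

# The same bytes decoded with a single-byte encoding give different text.
as_latin1 = raw.decode("latin-1")

# Combining binary mode with an encoding raises the ValueError.
try:
    open(path, "rb", encoding="utf-8")
except ValueError as exc:
    error_message = str(exc)

os.remove(path)  # clean up the temporary file
```

Here raw holds five bytes while decoded holds four characters, which is why an encoding only makes sense once the bytes are interpreted as text.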

Code-Apprentice answered Oct 14 '22 03:10
