While I was trying to write a Python program that converts ANSI to UTF-8, I found this
https://stackoverflow.com/questions/14732996/how-can-i-convert-utf-8-to-ansi-in-python
which converts UTF-8 to ANSI.
I thought it would just work by reversing the order, so I wrote:
file_path_ansi = "input.txt"
file_path_utf8 = "output.txt"
# open the source file and decode its content
file_source = open(file_path_ansi, mode='r', encoding='latin-1', errors='ignore')
file_content = file_source.read()
file_source.close()
# write the content back out as UTF-8
file_target = open(file_path_utf8, mode='w', encoding='utf-8')
file_target.write(file_content)
file_target.close()
But it raises an error:
TypeError: file() takes at most 3 arguments (4 given)
So I changed
file_source = open(file_path_ansi, mode='r', encoding='latin-1', errors='ignore')
to
file_source = open(file_path_ansi, mode='r', encoding='latin-1')
Then it causes another error:
TypeError: 'encoding' is an invalid keyword argument for this function
How should I fix my code to solve this problem?
You are trying to use the Python 3 version of the open() function on Python 2. Between the major versions, I/O support was overhauled, supporting better encoding and decoding.
You can get the same new version in Python 2 as io.open() instead.
I'd use the shutil.copyfileobj() function to do the copying, so you don't have to read the whole file into memory:
import io
import shutil

with io.open(file_path_ansi, encoding='latin-1', errors='ignore') as source:
    with io.open(file_path_utf8, mode='w', encoding='utf-8') as target:
        shutil.copyfileobj(source, target)
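If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function name convert_to_utf8 and its parameters are made up for illustration, and the cp1252 default is an assumption about what your "ANSI" files really contain. It uses errors='strict' by default so that undecodable bytes raise an error instead of being silently dropped (with latin-1 every byte value decodes, so errors='ignore' has no effect there anyway).

```python
import io
import shutil


def convert_to_utf8(src_path, dst_path, src_encoding="cp1252", errors="strict"):
    """Re-encode a text file as UTF-8, copying it in chunks.

    Hypothetical helper; src_encoding is an assumption about the input file.
    """
    # newline="" turns off newline translation, so line endings are copied as-is.
    with io.open(src_path, mode="r", encoding=src_encoding,
                 errors=errors, newline="") as source:
        with io.open(dst_path, mode="w", encoding="utf-8", newline="") as target:
            # copyfileobj reads and writes in chunks rather than loading the whole file.
            shutil.copyfileobj(source, target)
```

You would call it as convert_to_utf8("input.txt", "output.txt"), passing src_encoding="latin-1" if the file really is ISO-8859-1. Because it goes through io.open(), it should behave the same on Python 2.6+ and Python 3.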
Be careful though; most people talking about ANSI refer to one of the Windows codepages; you may really have a file in CP (codepage) 1252, which is almost, but not quite the same thing as ISO-8859-1 (Latin 1). If so, use cp1252 instead of latin-1 as the encoding parameter.
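To see where the two encodings differ, here is a small sketch that decodes the same bytes both ways. The sample bytes are made up for illustration; the differences are confined to the 0x80-0x9F range, which CP 1252 uses for printable characters such as the euro sign and curly quotes, while Latin 1 maps those bytes to control characters.

```python
# Sample bytes (invented for this example): 0x80, 0x93 and 0x94 fall in the
# range where the two encodings disagree.
raw = b"\x80100 \x93quoted\x94"

as_cp1252 = raw.decode("cp1252")   # euro sign and curly double quotes
as_latin1 = raw.decode("latin-1")  # C1 control characters instead

print(repr(as_cp1252))
print(repr(as_latin1))

# A few byte values (0x81 among them) are undefined in Python's cp1252 codec,
# so strict decoding of such a byte fails, while latin-1 accepts every byte.
try:
    b"\x81".decode("cp1252")
    undefined_byte_rejected = False
except UnicodeDecodeError:
    undefined_byte_rejected = True

print(undefined_byte_rejected)
```

If the output file shows odd invisible characters where you expected quotes or a euro sign, that is a strong hint the input was CP 1252 rather than Latin 1.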