
Python script to convert from UTF-8 to ASCII [duplicate]

I'm trying to write a Python script to convert UTF-8 files into ASCII files:

#!/usr/bin/env python
# *-* coding: iso-8859-1 *-*

import sys
import os

filePath = "test.lrc"
fichier = open(filePath, "rb")
contentOfFile = fichier.read()
fichier.close()

fichierTemp = open("tempASCII", "w")
fichierTemp.write(contentOfFile.encode("ASCII", 'ignore'))
fichierTemp.close()

When I run this script, I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 13: ordinal not in range(128)

I thought I could ignore the error by passing the 'ignore' parameter to the encode method, but apparently not.

I'm open to other ways to convert.

Nicolas asked Nov 28 '10 23:11


3 Answers

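# Python 2: `data` is a byte string holding UTF-8 encoded text.
data = "UTF-8 DATA"
udata = data.decode("utf-8")  # bytes -> unicode
asciidata = udata.encode("ascii", "ignore")  # unicode -> ASCII bytes, dropping non-ASCII characters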
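
The error in the question comes from Python 2 implicitly decoding the byte string as ASCII before it can encode it, which is why decoding explicitly as UTF-8 first helps. For readers on Python 3, here is a minimal sketch of the same decode-then-encode idea, assuming the input arrives as bytes. The function name `utf8_bytes_to_ascii` is hypothetical and used only for illustration.

```python
def utf8_bytes_to_ascii(raw):
    """Decode UTF-8 bytes to text, then encode to ASCII, dropping non-ASCII characters."""
    text = raw.decode("utf-8")             # bytes -> str, explicit UTF-8 decode
    return text.encode("ascii", "ignore")  # str -> bytes, unencodable characters are skipped


if __name__ == "__main__":
    sample = "café naïve".encode("utf-8")
    print(utf8_bytes_to_ascii(sample))
```

Note that "ignore" silently removes the accented characters rather than replacing them with a close ASCII equivalent.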
Utku Zihnioglu answered Nov 11 '22 07:11


import codecs

 ...

fichier = codecs.open(filePath, "r", encoding="utf-8")

 ...

fichierTemp = codecs.open("tempASCII", "w", encoding="ascii", errors="ignore")
fichierTemp.write(contentOfFile)

 ...
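
A fuller sketch of the same file-based approach, swapping `codecs.open` for `io.open` (available in Python 2.6+ and Python 3). The function name `convert_utf8_file_to_ascii` and the `newline=""` argument are my own assumptions, added so that line endings pass through unchanged; they are not part of the snippet above.

```python
import io


def convert_utf8_file_to_ascii(src_path, dst_path):
    """Copy a UTF-8 text file to dst_path as ASCII, dropping non-ASCII characters."""
    # Read the source as text, decoding it explicitly as UTF-8.
    with io.open(src_path, "r", encoding="utf-8", newline="") as src:
        content = src.read()
    # errors="ignore" makes the writer skip characters ASCII cannot represent.
    with io.open(dst_path, "w", encoding="ascii", errors="ignore", newline="") as dst:
        dst.write(content)
```

It could be called as `convert_utf8_file_to_ascii("test.lrc", "tempASCII")`, using the file names from the question.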
Ignacio Vazquez-Abrams answered Nov 11 '22 06:11


UTF-8 is a superset of ASCII. Either your UTF-8 file is ASCII, or it can't be converted without loss.
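
To illustrate the point, a small Python 3 sketch: ASCII-only text produces identical bytes under both encodings, while anything else loses characters when forced into ASCII. The helper name `is_pure_ascii` is hypothetical.

```python
def is_pure_ascii(raw):
    """Return True if the byte string contains only ASCII (bytes below 0x80)."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


ascii_only = "plain text"
accented = "déjà vu"

# For ASCII-only text, the UTF-8 and ASCII encodings are byte-for-byte identical.
same_bytes = ascii_only.encode("utf-8") == ascii_only.encode("ascii")

# For anything else, forcing ASCII with "ignore" drops characters.
lossy = accented.encode("ascii", "ignore").decode("ascii")
```

Checking with `is_pure_ascii` first tells you whether any conversion is needed at all.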

Tobu answered Nov 11 '22 05:11