Change "Quoted-printable" encoding to "utf-8"

I am trying to read email with imaplib. I get this mail body:

=C4=EE=E1=F0=FB=E9 =E4=E5=ED=FC!  

That is quoted-printable encoding.
I need to get UTF-8 from this. It should be Добрый день!

I googled it, but the answers are a mess because they differ between Python versions. In Python 3 the body is already a unicode string, so I can't just call .encode('utf-8') on it.

How can I change this to utf-8?

Qiao asked Jan 10 '13 01:01

1 Answer

The quopri module can convert the quoted-printable text back into the raw bytes it represents. You then need to decode those bytes from whatever character set they're in (here, windows-1251) to get a string, and encode that string to utf-8 if you need utf-8 bytes.

>>> import quopri
>>> b = quopri.decodestring(b'=C4=EE=E1=F0=FB=E9 =E4=E5=ED=FC')
>>> print(b.decode('windows-1251'))
Добрый день
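
To end up with actual utf-8 bytes, the three steps can be wrapped in one helper. This is only a minimal sketch: the function name qp_to_utf8 and its source_charset parameter are made up for illustration, and it assumes you already know the charset (windows-1251 for the body in the question).

```python
import quopri


def qp_to_utf8(qp_body: bytes, source_charset: str) -> bytes:
    """Hypothetical helper: quoted-printable bytes in, UTF-8 bytes out."""
    # Step 1: undo the quoted-printable transfer encoding (=C4 -> byte 0xC4).
    raw = quopri.decodestring(qp_body)
    # Step 2: interpret the raw bytes in the charset the mail was written in.
    text = raw.decode(source_charset)
    # Step 3: re-encode the resulting str as UTF-8.
    return text.encode("utf-8")


# Body from the question; the charset is an assumption that happens to fit it.
body = b"=C4=EE=E1=F0=FB=E9 =E4=E5=ED=FC!"
utf8_bytes = qp_to_utf8(body, "windows-1251")
print(utf8_bytes.decode("utf-8"))  # Добрый день!
```

In a real message the charset normally comes from the Content-Type header rather than being hard-coded. If you parse the mail with the standard email package, get_content_charset() on the message part should give it to you.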
Mark Ransom answered Oct 21 '22 16:10