Decode a UTF-8 email header

I have an email subject of the form:

=?utf-8?B?T3.....?= 

The body of the email is UTF-8, base64 encoded, and has decoded fine. I am currently using Perl's Email::MIME module to decode the email.

What is the meaning of the =?utf-8 delimiter and how do I extract information from this string?

CoffeeMonster asked Sep 24 '08 08:09


2 Answers

The encoded-word tokens (as per RFC 2047) can occur in values of some headers. They are parsed as follows:

=?<charset>?<encoding>?<data>?= 

The charset is UTF-8 in this case, and the encoding is B, which means base64 (the other option is Q, an encoding similar to quoted-printable).

To read it, first base64-decode the data, then interpret the resulting bytes as UTF-8 text.
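The two steps can be sketched by hand with the core modules MIME::Base64 and Encode. This is only an illustration under some assumptions: `decode_b_word` is a hypothetical helper name, the sample subject is made up, and the sketch handles a single B-encoded word only (no Q encoding, no multiple words in one header).

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: decode a single B-encoded word of the form
# =?<charset>?<encoding>?<data>?=
sub decode_b_word {
    my ($word) = @_;

    # Split the token into its three fields; give up if it does not match.
    my ($charset, $encoding, $data) =
        $word =~ /\A=\?([^?]+)\?([BbQq])\?([^?]*)\?=\z/
        or return;

    # This sketch only covers base64 ("B"); Q-encoded words are skipped.
    return unless uc($encoding) eq 'B';

    my $bytes = decode_base64($data);    # step 1: base64 -> raw bytes
    return decode($charset, $bytes);     # step 2: bytes -> Perl characters
}

# Made-up sample: the base64 data is the UTF-8 encoding of "café".
my $subject = decode_b_word('=?utf-8?B?Y2Fmw6k=?=');

# The result is a character string, so encode it again on output.
binmode(STDOUT, ':encoding(UTF-8)');
print "$subject\n";
```

In practice a ready-made decoder is usually the better choice, since real headers can mix plain text with several encoded words.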

Also read the various Internet Mail RFCs for more detail, mainly RFC 2047.

Since you are using Perl, Encode::MIME::Header could be of use:

SYNOPSIS

    use Encode qw/encode decode/;
    $utf8   = decode('MIME-Header', $header);
    $header = encode('MIME-Header', $utf8);

ABSTRACT

This module implements RFC 2047 Mime Header Encoding. There are 3 variant encoding names; MIME-Header, MIME-B and MIME-Q. The difference is described below:

                  decode()          encode()
    MIME-Header   Both B and Q      =?UTF-8?B?....?=
    MIME-B        B only; Q croaks  =?UTF-8?B?....?=
    MIME-Q        Q only; B croaks  =?UTF-8?Q?....?=
1800 INFORMATION answered Sep 16 '22 14:09


I think that the Encode module handles that with the MIME-Header encoding, so try this:

    use Encode qw(decode);
    my $decoded = decode("MIME-Header", $encoded);
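As a small worked example of that call, the sketch below decodes one B-encoded and one Q-encoded word. The sample strings are made up for illustration (both spell "café"); the only API used is `decode` and `encode` from Encode.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up sample headers: the same text in B (base64) and Q encoding.
my $from_b = decode('MIME-Header', '=?utf-8?B?Y2Fmw6k=?=');
my $from_q = decode('MIME-Header', '=?UTF-8?Q?caf=C3=A9?=');

# decode() returns Perl character strings, so encode them to bytes
# (here UTF-8) before printing.
print encode('UTF-8', "$from_b\n");
print encode('UTF-8', "$from_q\n");
```

Because the result is a character string rather than raw bytes, printing it without an output encoding may produce "Wide character" warnings for text outside Latin-1.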
moritz answered Sep 20 '22 14:09