I have an email subject of the form:
=?utf-8?B?T3.....?=
The body of the email is utf-8 base64 encoded - and has decoded fine. I am current using Perl's Email::MIME module to decode the email.
What is the meaning of the =?utf-8 delimiter and how do I extract information from this string?
To decode a string encoded in UTF-8 format, we can use the decode() method specified on strings. This method accepts two arguments, encoding and error . encoding accepts the encoding of the string to be decoded, and error decides how to handle errors that arise during decoding.
The encoded-word
tokens (as per RFC 2047) can occur in values of some headers. They are parsed as follows:
=?<charset>?<encoding>?<data>?=
Charset is UTF-8 in this case, the encoding is B
which means base64 (the other option is Q
which means Quoted Printable).
To read it, first decode the base64, then treat it as UTF-8 characters.
Also read the various Internet Mail RFCs for more detail, mainly RFC 2047.
Since you are using Perl, Encode::MIME::Header could be of use:
SYNOPSIS
use Encode qw/encode decode/; $utf8 = decode('MIME-Header', $header); $header = encode('MIME-Header', $utf8);
ABSTRACT
This module implements RFC 2047 Mime Header Encoding. There are 3 variant encoding names; MIME-Header, MIME-B and MIME-Q. The difference is described below
decode() encode() MIME-Header Both B and Q =?UTF-8?B?....?= MIME-B B only; Q croaks =?UTF-8?B?....?= MIME-Q Q only; B croaks =?UTF-8?Q?....?=
I think that the Encode module handles that with the MIME-Header
encoding, so try this:
use Encode qw(decode); my $decoded = decode("MIME-Header", $encoded);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With