
UTF-8 character encoding for email headers with PHP

I'm trying to encode a → (right arrow, &rarr;, Unicode U+2192) in an email subject line.

When I use PHP's mb_encode_mimeheader() I get a different value from the one Thunderbird or Gmail produces for the same character, and when the PHP-generated email arrives, the character is not displayed properly. Also, PHP's mb_decode_mimeheader() decodes PHP's own output, but not headers produced by the other email sources.

From a hex dump, I've worked out that the UTF-8 representation of the arrow is the three bytes e2 86 92:

<?php
$rarr = "\xe2\x86\x92";

mb_encode_mimeheader($rarr, 'UTF-8'); //     =?UTF-8?B?w6LChsKS?=
// whereas Tbird and Gmail produce:          =?UTF-8?B?4oaS?=
// and more manually:
'=?UTF-8?B?' . base64_encode($rarr).'?='; // =?UTF-8?B?4oaS?=

PHP's encoding comes out in Thunderbird and Gmail as: â

I am completely confused by PHP's behaviour as it does not appear to be producing standard results.

How can I get PHP to encode a UTF-8 email header value so that it will be properly decoded by mail clients?

asked Feb 18 '23 11:02 by artfulrobot


1 Answer

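It seems the second parameter is ignored when the input is interpreted (possibly a bug); the string appears to be read in the internal encoding instead. I get the correct result when I set the internal encoding first: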

<?php
$rarr = "\xe2\x86\x92";
mb_internal_encoding('UTF-8');
echo mb_encode_mimeheader($rarr, 'UTF-8'); // =?UTF-8?B?4oaS?=

But without setting the internal encoding, the result is wrong:

<?php
$rarr = "\xe2\x86\x92";

mb_encode_mimeheader($rarr, 'UTF-8'); //=?UTF-8?B?w6LChsKS?=
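
Decoding the two Base64 payloads shows what the wrong value contains. The following is a minimal sketch, not part of the original answer; it assumes the input was treated as ISO-8859-1 and converted to UTF-8 a second time, which would explain the extra bytes. The variable names `$wrong`, `$right`, and `$doubleEncoded` are illustrative.

```php
<?php
$rarr = "\xe2\x86\x92"; // UTF-8 bytes for U+2192

// Raw bytes carried by the wrong and the correct encoded-words.
$wrong = base64_decode('w6LChsKS');
$right = base64_decode('4oaS');

echo bin2hex($wrong), "\n"; // c3a2c286c292
echo bin2hex($right), "\n"; // e28692

// Assumption: the UTF-8 input was read as ISO-8859-1 and re-encoded to UTF-8.
// Passing the source encoding explicitly keeps this independent of
// mb_internal_encoding().
$doubleEncoded = mb_convert_encoding($rarr, 'UTF-8', 'ISO-8859-1');

var_dump($doubleEncoded === $wrong); // bool(true)
```

The first decoded byte pair, c3 a2, is the UTF-8 encoding of â, which matches what the mail clients display.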

Setting the internal encoding alone is enough, even without the second parameter:

<?php
$rarr = "\xe2\x86\x92";
mb_internal_encoding('UTF-8');
echo mb_encode_mimeheader($rarr); // =?UTF-8?B?4oaS?=
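
If changing the internal encoding globally is undesirable, the same workaround can be wrapped so that the previous setting is restored afterwards. This is a sketch under that assumption only; `encode_mime_header_utf8()` is a hypothetical helper name, not a PHP or mbstring function.

```php
<?php
// Hypothetical helper: applies the workaround above, then restores the
// caller's internal encoding so the rest of the script is unaffected.
function encode_mime_header_utf8($text)
{
    $previous = mb_internal_encoding();   // remember the current setting
    mb_internal_encoding('UTF-8');        // make mbstring read the input as UTF-8
    $encoded = mb_encode_mimeheader($text, 'UTF-8', 'B');
    mb_internal_encoding($previous);      // put the old setting back
    return $encoded;
}

$rarr = "\xe2\x86\x92";
echo encode_mime_header_utf8($rarr), "\n"; // =?UTF-8?B?4oaS?=
```

The explicit 'B' argument requests Base64 transfer encoding, matching the output shown in the examples above.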
answered Feb 21 '23 03:02 by Esailija