
Encoding problem (UTF-8) in PHP

I want to output the following string in PHP:

ä ö ü ß €

Therefore, I've encoded it to UTF-8 manually:

ä ö ü ß €

So my script is:

<?php
header('content-type: text/html; charset=utf-8');
echo 'ä ö ü ß €';
?>

The first 4 characters are correct (ä ö ü ß) but unfortunately the € sign isn't correct:

ä ö ü ß

Here you can see it.

Can you tell me what I've done wrong? My editor (Notepad++) has settings for Encoding (ANSI/UTF-8) and Format (Windows/Unix). Do I have to change them?

I hope you can help me. Thanks in advance!

caw asked Sep 07 '09 10:09




2 Answers

That last character just isn't in the file (try viewing the source), which is why you don't see it.

I think you might be better off saving the PHP file as UTF-8 (in Notepad++ that option is available in Format -> Encode in UTF-8 without BOM), and inserting the actual characters in your PHP file (i.e. in Notepad++), rather than hacking around with inserting à everywhere. You may find Windows Character Map useful for inserting Unicode characters.
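
A minimal sketch of that approach, assuming the file really is saved as UTF-8 without BOM. The literal characters are echoed directly. The `preg_match()` check with the `/u` modifier is an extra sanity check added here for illustration and is not part of the original script; it fails if the file was saved in some other encoding.

```php
<?php
// Assumption: this file is saved as UTF-8 without BOM, so the literal
// below is already stored as UTF-8 bytes.
$text = 'ä ö ü ß €';

// preg_match() with the /u modifier does not match when the subject is
// not valid UTF-8, which makes it a cheap check that the file encoding
// agrees with the charset declared in the header.
$isUtf8 = preg_match('//u', $text) === 1;

// Send the charset header only if no output has been sent yet.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo $isUtf8 ? $text : 'This file is not saved as UTF-8';
```

If the fallback message appears, the editor's encoding setting and the declared charset disagree.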

Dominic Rodger answered Sep 20 '22 22:09


The Euro sign (U+20AC) is encoded in UTF-8 with three bytes, not two. This can be seen here. So your encoding is simply wrong.
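
To check the byte counts yourself, here is a small sketch. The helper name `utf8_hex` is made up for illustration, and the `\u{...}` escape requires PHP 7.0 or later; it produces UTF-8 bytes regardless of how the file itself is saved.

```php
<?php
// Hypothetical helper: render a string's raw bytes as spaced hex pairs.
function utf8_hex(string $s): string
{
    return strtoupper(implode(' ', str_split(bin2hex($s), 2)));
}

// "\u{...}" (PHP 7.0+) always yields the UTF-8 encoding of the code point.
$chars = [
    'U+00E4' => "\u{00E4}", // a with diaeresis
    'U+00F6' => "\u{00F6}", // o with diaeresis
    'U+00FC' => "\u{00FC}", // u with diaeresis
    'U+00DF' => "\u{00DF}", // sharp s
    'U+20AC' => "\u{20AC}", // Euro sign
];

foreach ($chars as $codePoint => $char) {
    // strlen() counts bytes, not characters.
    printf("%s: %s (%d bytes)\n", $codePoint, utf8_hex($char), strlen($char));
}
// U+00E4: C3 A4 (2 bytes)
// U+00F6: C3 B6 (2 bytes)
// U+00FC: C3 BC (2 bytes)
// U+00DF: C3 9F (2 bytes)
// U+20AC: E2 82 AC (3 bytes)
```

The first four characters take two bytes each, while the Euro sign takes three, so a hand-made two-byte sequence cannot represent it.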

Joey answered Sep 24 '22 22:09