
PHP str_replace not working with special chars

Why isn't this working as expected?

 echo str_replace("Ã©","é","FÃ©dÃ©ration Camerounaise de Football");

Result:

"FÃ©dÃ©ration Camerounaise de Football"

I'm expecting to have:

"Fédération Camerounaise de Football"
Rachid O asked Aug 23 '14 18:08


2 Answers

You are doing it wrong. This string is not incorrect and does not need a replacement; it is simply encoded in UTF-8.

All you have to do is utf8_decode('FÃ©dÃ©ration Camerounaise de Football').
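For what it's worth, here is a minimal sketch of what that decoding step does at the byte level. It assumes the garbled text is "é" that went through UTF-8 encoding twice, and that the mbstring extension is available. mb_convert_encoding() is swapped in for utf8_decode() because utf8_decode() is deprecated as of PHP 8.2. Byte escapes are used so the result does not depend on how the file itself is saved.

```php
<?php
// "é" as one UTF-8 character: bytes C3 A9.
$correct = "F\xC3\xA9d\xC3\xA9ration";

// Assumed garbled form: each of those two bytes was read as a Latin-1
// character and encoded to UTF-8 again, giving C3 83 C2 A9 per "é".
$garbled = "F\xC3\x83\xC2\xA9d\xC3\x83\xC2\xA9ration";

// UTF-8 -> ISO-8859-1 strips one layer of encoding, which is the same
// conversion utf8_decode() performs.
$fixed = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');

var_dump($fixed === $correct); // bool(true)
```

After one round of decoding, the bytes are plain UTF-8 again and need no str_replace at all.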

Update:

You are seeing FÃ©dÃ©ration Camerounaise de Football as output because your data is being passed through UTF-8 encoding twice.

Observe:

file1.php saved in UTF-8 format:

<?php
    echo "Fédération Camerounaise de Football";

Output:

FÃ©dÃ©ration Camerounaise de Football

Now, if you tell the browser you are using UTF-8, it should display the content straight:

file2.php saved in UTF-8 format:

<?php
    header('Content-Type: text/html; charset=utf-8');
    echo "Fédération Camerounaise de Football";

Output:

Fédération Camerounaise de Football

Perfect.

However, you are doing something even worse. You have a UTF-8 encoded string and are encoding it again by writing it to a UTF-8 encoded file.

file3.php saved in UTF-8 format:

<?php
    echo "FÃ©dÃ©ration Camerounaise de Football";

Output:

FÃƒÂ©dÃƒÂ©ration Camerounaise de Football

What a mess. Let's make it worse by seeing if we can fix this with str_replace:

file4.php saved in UTF-8 format:

<?php
    echo str_replace("Ã©","é","FÃ©dÃ©ration Camerounaise de Football");

Output:

FÃ©dÃ©ration Camerounaise de Football

As you can see, we "fixed" it. Sort of. That's what you are doing. You are transforming Ã© into é, but you don't see it, because your editor won't show you the real bytes behind the encoding. The browser does.
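Here is a rough sketch of that replacement at the byte level. It assumes the search string is the two-character sequence Ã© (how a UTF-8 é looks when read as Latin-1) and that everything was typed into a UTF-8 file; the variable names are only for illustration.

```php
<?php
$search  = "\xC3\x83\xC2\xA9"; // "Ã©" as stored in a UTF-8 file: 4 bytes
$replace = "\xC3\xA9";         // "é" as stored in a UTF-8 file: 2 bytes
$subject = "F\xC3\x83\xC2\xA9d\xC3\x83\xC2\xA9ration"; // "FÃ©dÃ©ration" in a UTF-8 file

$result = str_replace($search, $replace, $subject);

// The result is valid UTF-8 for "Fédération" ...
echo bin2hex($result), "\n"; // 46c3a964c3a9726174696f6e
```

So str_replace did its job. The output only looks broken because a browser that has not been told the charset reads each c3 a9 pair as two Latin-1 characters, Ã and ©.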

Let's try this again with ASCII:

file5.php saved in ASCII format:

<?php
    echo str_replace("Ã©","é","FÃ©dÃ©ration Camerounaise de Football");

Output:

Fédération Camerounaise de Football

Magic! The browser gets everything right now. But what's the real solution? Well, if you have the string hardcoded in your PHP file, you should simply write Fédération Camerounaise de Football instead of hardcoding the garbled version. But if you are fetching it from another file or a database, you should take one of two courses:

  1. Use utf8_decode() to transform the data you fetch into your desired output.

  2. Don't transform anything and use header('Content-Type: text/html; charset=utf-8'); to tell the browser you are printing content in UTF-8 format, so it will display things correctly.
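The two courses can be sketched together like this. prepareOutput() is a hypothetical helper, not something from PHP itself; it assumes the fetched text is UTF-8 and that mbstring is available (used here in place of utf8_decode(), which is deprecated as of PHP 8.2). The point is just that the declared charset and the actual bytes have to agree.

```php
<?php
// Hypothetical helper: given UTF-8 text and the charset the page will
// declare, return the Content-Type header line and the body bytes.
function prepareOutput(string $utf8Text, string $charset = 'UTF-8'): array
{
    $body = strcasecmp($charset, 'UTF-8') === 0
        ? $utf8Text                                          // course 2: send unchanged
        : mb_convert_encoding($utf8Text, $charset, 'UTF-8'); // course 1: convert first

    return ['Content-Type: text/html; charset=' . $charset, $body];
}

[$headerLine, $body] = prepareOutput("F\xC3\xA9d\xC3\xA9ration Camerounaise de Football");

// Headers can only be set before any output has been sent.
if (!headers_sent()) {
    header($headerLine);
}
echo $body;
```

Passing 'ISO-8859-1' as the second argument gives the first course: the body is converted and the header announces the converted charset.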

Havenard answered Nov 17 '22 02:11


//edit after comment

Fédération Camerounaise de Football is a UTF-8 encoded string, so I don't know which input in your document is not UTF-8 encoded, but there are two possibilities.

  1. The input passed to str_replace is UTF-8, but the characters you typed as the search and replace arguments are ANSI or something else, so it does not work. This means your document is not UTF-8, which is why utf8_decode works: str_replace(ANSI, ANSI, CONVERT_TO_ANSI(UTF-8))

  2. Your input is not UTF-8 and your document is, so this would work: str_replace(UTF-8, UTF-8, CONVERT_TO_UTF-8(ANSI))


str_replace works fine with multibyte characters. Your problem is not the function; it is that you are trying to replace across different encodings. Instead of using an alternative function, I suggest you convert the input passed to str_replace to UTF-8 and make sure your document is UTF-8 encoded too.

If your source only supports a non-UTF-8 encoding, use utf8_encode to convert your input to UTF-8 (it converts from ISO-8859-1).

http://php.net/manual/de/function.utf8-encode.php
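A small sketch of that mismatch, assuming the input arrives as ISO-8859-1 and the script is saved as UTF-8. mb_convert_encoding() (from the mbstring extension) is used in place of utf8_encode(), which is deprecated as of PHP 8.2; the variable names are just for illustration.

```php
<?php
$latin1Input = "F\xE9d\xE9ration"; // "Fédération" in ISO-8859-1: é is the single byte E9
$utf8Search  = "\xC3\xA9";         // "é" as typed into a UTF-8 source file

// Mismatched encodings: the byte pair C3 A9 never occurs in the input.
$count = 0;
str_replace($utf8Search, 'e', $latin1Input, $count);
echo $count, "\n"; // 0

// Convert the input to UTF-8 first, then the same search matches.
$utf8Input = mb_convert_encoding($latin1Input, 'UTF-8', 'ISO-8859-1');
$result = str_replace($utf8Search, 'e', $utf8Input, $count);
echo $count, ' ', $result, "\n"; // 2 Federation
```

Once search, replacement, and subject all share one encoding, str_replace needs no special handling for accented characters.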

ins0 answered Nov 17 '22 00:11