
Remove Non English Characters PHP

Tags: php, character

How can I parse a string to remove all non-English characters in PHP?

Right now I want to remove things like

სოფო ნი

Thanks :)

Belgin Fish asked Sep 06 '10 23:09


3 Answers

// Remove every character outside the ASCII range (U+0000 to U+007F).
$str = preg_replace('/[^\x00-\x7F]+/u', '', $str);
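
One caveat with the /u modifier: preg_replace() returns null when the subject is not valid UTF-8, which can happen with truncated or mis-encoded input. The following is a minimal sketch, not part of the original answer, that assumes an explicit ASCII range (\x00-\x7F) and falls back to byte-wise matching in that case. The function name strip_non_ascii is hypothetical.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
function strip_non_ascii(string $str): string
{
    // With /u the subject must be valid UTF-8; otherwise preg_replace() returns null.
    $result = preg_replace('/[^\x00-\x7F]+/u', '', $str);

    if ($result === null) {
        // Fall back to byte-wise matching, which accepts any byte sequence.
        $result = preg_replace('/[^\x00-\x7F]+/', '', $str);
    }

    return (string) $result;
}

// Valid UTF-8 input: the Georgian word is removed, both surrounding spaces stay.
var_dump(strip_non_ascii("some სოფო text"));

// Invalid UTF-8 input (a truncated three-byte sequence) is handled by the fallback.
var_dump(strip_non_ascii("broken \xE1\x83 text"));
```

Note that the spaces on either side of the removed text are kept, so the result can contain a double space.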
aularon answered Nov 07 '22 09:11


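Your best option would be to use iconv, which converts a string to the requested character encoding.

$ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $yourtext);

With //TRANSLIT you get a meaningful conversion to ASCII where one exists (e.g. ß -> ss). The exact result depends on the iconv implementation and locale, and characters with no ASCII approximation may come out as "?". Using //IGNORE instead strips non-ASCII characters altogether.

$ascii = iconv('UTF-8', 'ASCII//IGNORE', $yourtext);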

See http://php.net/manual/en/function.iconv.php
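
Because iconv behaves differently across platforms, and can return false with a notice on input it cannot convert, it may help to wrap the call. The sketch below is an assumption-laden illustration rather than the answer's own code: the helper name to_ascii is hypothetical, and the final preg_replace() pass is added here only to guarantee that nothing outside ASCII survives.

```php
<?php
// Hypothetical helper (not from the original answer).
function to_ascii(string $text): string
{
    $converted = false;

    if (function_exists('iconv')) {
        // @ suppresses the notice iconv raises on unconvertible input;
        // in that case the call returns false.
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    }

    if ($converted === false) {
        // iconv is unavailable or failed: continue with the original text.
        $converted = $text;
    }

    // Byte-wise final pass: drop anything that is still outside ASCII.
    return (string) preg_replace('/[^\x00-\x7F]+/', '', $converted);
}

// The output varies by platform: some iconv builds substitute "?" for
// characters they cannot transliterate, others fail and trigger the fallback.
var_dump(to_ascii("some სოფო text"));
```

Whatever the platform does, the returned string contains only ASCII bytes; what differs is whether untranslatable characters vanish or turn into placeholders such as "?".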

Tero Lahtinen answered Nov 07 '22 11:11


Use preg_replace():

$string = "some სოფო text"; 
$string = preg_replace('/[^a-z0-9_ ]/i', '', $string); 

echo $string;

Granted, you will need to expand the preg_replace pattern to cover any other characters you want to keep, such as punctuation, but that is one way to do it. There is probably a better way; I just do not know it.
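
One possible way to expand the pattern, sketched here as an assumption rather than as the answer's own method, is to whitelist the whole printable ASCII range (space through tilde) instead of listing characters one by one, and then to collapse the gaps left behind. The function name keep_printable_ascii is hypothetical.

```php
<?php
// Hypothetical helper illustrating a broader whitelist.
function keep_printable_ascii(string $text): string
{
    // Keep only printable ASCII: space (0x20) through tilde (0x7E).
    // No /u modifier, so multi-byte UTF-8 characters are removed byte by byte.
    $clean = (string) preg_replace('/[^\x20-\x7E]+/', '', $text);

    // Collapse the whitespace runs left where characters were removed, then trim.
    return trim((string) preg_replace('/\s+/', ' ', $clean));
}

echo keep_printable_ascii("some სოფო text, with punctuation!");
// prints: some text, with punctuation!
```

This keeps letters, digits, and common punctuation, but it also drops tabs and newlines, so the range would need adjusting for multi-line input.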

Jim answered Nov 07 '22 09:11