Remove garbage characters in arabic

Tags: regex, php

I needed to remove all non-Arabic characters from a string and, with help from people on Stack Overflow, came up with the following regex, which strips every character that is not Arabic.

preg_replace('/[^\x{0600}-\x{06FF}]/u','',$string);

The problem is that the above removes whitespace too. I have also discovered that I need to keep the characters A-Z, a-z, 0-9 and !@#$%^&*(). How should I modify the regex?

Thank you.

Imran Omar Bukhsh asked Jul 10 '11 18:07


2 Answers

Add the characters you want to keep to your negated character class:

preg_replace('/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*()]/u','', $string);
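As a minimal sketch of how this could be wrapped for reuse, assuming the input is valid UTF-8: the function name `keep_allowed_chars` is hypothetical, and returning an empty string when `preg_replace` fails (it returns null on malformed UTF-8 with the `u` modifier) is an illustrative choice, not part of the answer.

```php
<?php
// Hypothetical helper, for illustration only.
function keep_allowed_chars(string $input): string
{
    // Negated class: anything that is NOT in the Arabic block (U+0600 to U+06FF),
    // an ASCII letter, a digit, a space, or one of !@#$%^&*() gets removed.
    // Inside a character class, $ * ( ) are literal, and ^ is literal
    // when it is not the first character.
    $pattern = '/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*()]/u';

    // preg_replace() returns null on error; fall back to an empty string
    // (an assumption made for this sketch).
    return preg_replace($pattern, '', $input) ?? '';
}

echo keep_allowed_chars("abc~123"), "\n";   // abc123
echo keep_allowed_chars("نص, test!"), "\n"; // the ASCII comma is removed
```

Only the plain space is kept here; tabs and newlines would still be stripped unless `\s` is added to the class.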
Ray Toal answered Oct 11 '22 07:10


Assume you have this string:

$str = "Arabic Text نص عربي test 123 و,.m,............ ~~~ ٍ،]ٍْ}~ِ]ٍ}";

This keeps only Arabic characters and spaces:

echo preg_replace('/[^أ-ي ]/ui', '', $str);

This keeps only Arabic and English characters, digits, and spaces:

echo preg_replace('/[^أ-يA-Za-z0-9 ]/ui', '', $str);

This answers your question literally:

echo preg_replace('/[^أ-يA-Za-z0-9 !@#$%^&*()]/ui', '', $str);
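One difference worth noting: the range أ-ي runs from U+0623 to U+064A, which is narrower than the full Arabic block U+0600 to U+06FF used in the question. It leaves out, for example, آ (U+0622), the short-vowel diacritics, and the Arabic-Indic digits. The sketch below compares the two ranges. The sample word is an assumption chosen for illustration, and the range is written with escapes to avoid bidirectional rendering issues.

```php
<?php
// Illustrative comparison only; the sample string is an assumption.
$sample = "آمن test"; // first letter is آ (U+0622)

// Same range as أ-ي, written with code point escapes: U+0623 to U+064A.
$narrow = preg_replace('/[^\x{0623}-\x{064A}A-Za-z ]/u', '', $sample);

// Full Arabic block, U+0600 to U+06FF.
$wide = preg_replace('/[^\x{0600}-\x{06FF}A-Za-z ]/u', '', $sample);

var_dump($narrow === "من test");  // آ falls outside the narrow range and is dropped
var_dump($wide === "آمن test");   // the full block keeps it
```

Which range is appropriate depends on whether characters such as diacritics count as garbage for the text being cleaned.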
Mohammed Ahmed answered Oct 11 '22 07:10