Search for Arabic letters in Arabic words

Here is my working code:

<!DOCTYPE HTML>
<html>
    <head>
        <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
    </head>
    <body>
        <?php
            $arabic = "صحيفة اسبوعية مستقلة شاملة تتابع الاخبار فى المنطقة العربية";
            $french = "que voulez vous dire?";

            if (isset($_POST['search'])) {
                $search = $_POST['search'];
                $key = $_POST['key'];
                $td = substr_count($arabic, $key);
                echo $td;
            }

            echo "<br />" . $arabic;

            function count_occurences($char_string, $haystack, $case_sensitive = true) {
                if ($case_sensitive === false) {
                    $char_string = strtolower($char_string);
                    $haystack = strtolower($haystack);
                }

                $characters = preg_split('//u', $char_string, -1, PREG_SPLIT_NO_EMPTY);
                //$characters = str_split($char_string);
                $character_count = 0;

                foreach ($characters as $character) {
                    $character_count = $character_count + substr_count($haystack, $character);
                }

                return $character_count;
            }
        ?>
        <form name="input" action="" method="post">
            <input type="text" name="key" value=""/>
            <input type="submit" name="search" value=" find it !"/>
        </form> 
    </body>
</html>

With $french it works well, but with $arabic it doesn't. There is no error, but if I enter, for example, ح to search for that letter, it always shows 0, whatever letter I enter.

Is something wrong? Or am I missing something about Arabic? I don't understand why $french works: if I enter v, it shows 2 in the result.

echo_Me asked Mar 30 '13 13:03


2 Answers

You need to use Multibyte String Functions.
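
As a minimal sketch of the counting step with a multibyte function, assuming the mbstring extension is loaded and the file is saved as UTF-8 (the `$key` value here is a stand-in for `$_POST['key']`):

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$arabic = "صحيفة اسبوعية مستقلة شاملة تتابع الاخبار فى المنطقة العربية";
$key = "ح"; // stands in for $_POST['key']

// Count occurrences of the key, treating both strings as UTF-8 text.
$count = mb_substr_count($arabic, $key, 'UTF-8');

echo $count, "\n";
```

Passing the encoding explicitly avoids depending on whatever internal encoding the server happens to be configured with.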

You can also set mbstring.func_overload = 7 in your php.ini, and PHP will automatically use the multibyte counterparts of the standard string functions. Note that this setting was deprecated in PHP 7.2 and removed in PHP 8.0, so it only helps on older versions.

Look at the mbstring overloading documentation if you want some other value that overloads only the functions you need.

Also, replace

$characters = str_split($char_string);

with

$characters = preg_split('//u', $char_string, -1, PREG_SPLIT_NO_EMPTY);

because str_split is not multibyte-safe and, before mb_str_split was added in PHP 7.4, had no multibyte alternative.
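
The difference between the two calls can be seen in a short sketch, assuming the file is saved as UTF-8 (the sample word is only an illustration):

```php
<?php
// Assumes this file is saved as UTF-8, so each Arabic letter is two bytes.
$word = "حب";

// str_split() cuts after every byte, so each letter is torn in half.
$bytes = str_split($word);

// preg_split() with the /u modifier splits between UTF-8 characters.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

echo count($bytes), "\n"; // number of bytes
echo count($chars), "\n"; // number of characters
```

Half of a two-byte letter can never match a whole letter in the haystack, which is why the per-character loop needs the preg_split version.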

Additionally, if no encoding is sent in the headers after you submit the form, or there is some issue with them, you can set the following in your php.ini:

default_charset = "UTF-8"
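
To check whether the submitted value actually arrives as UTF-8, something like the following sketch may help. The helper name `describe_key` is hypothetical and not part of the original code; it relies only on `bin2hex` and on `preg_match` with the `/u` modifier, which fails on malformed UTF-8.

```php
<?php
// Hypothetical helper (not part of the original code): reports whether a
// submitted value is valid UTF-8 and shows its raw bytes for debugging.
function describe_key(string $key): string {
    // preg_match() with the /u modifier returns false on malformed UTF-8.
    $valid = preg_match('//u', $key) === 1;
    return bin2hex($key) . ($valid ? ' (valid UTF-8)' : ' (not valid UTF-8)');
}

// Stand-in for $_POST['key']; assumes this file is saved as UTF-8.
echo describe_key("ح"), "\n";
```

In the form handler, one could print `describe_key($_POST['key'])` temporarily to see which bytes the browser really sent.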

Marko D answered Oct 14 '22 22:10


I tested your code with UTF-8 encoding, and it works.

I've added a meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
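
One possible way to see why the page encoding matters is the sketch below. It assumes the file is saved as UTF-8 and, purely for illustration, that a browser without the charset declaration submits ح in a legacy single-byte encoding such as Windows-1256, where it would be the byte 0xCD:

```php
<?php
// Assumes this file is saved as UTF-8.
$arabic = "صحيفة اسبوعية مستقلة شاملة تتابع الاخبار فى المنطقة العربية";

$utf8_key = "ح";       // the letter as UTF-8 (bytes D8 AD)
$legacy_key = "\xCD";  // assumed: the same letter as a single Windows-1256 byte

$utf8_count = substr_count($arabic, $utf8_key);
$legacy_count = substr_count($arabic, $legacy_key);

echo $utf8_count, "\n";
echo $legacy_count, "\n";
```

When the key and the haystack use different encodings, their bytes never line up, so the count stays at 0 for every letter.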
mehdi answered Oct 14 '22 23:10