
How to Convert Arabic Characters to Unicode Using PHP

I want to know how I can convert a word into Unicode, exactly like this tool does: http://www.arabunic.free.fr/

Does anyone know how to do that in PHP, considering that Arabic text may contain ligatures?

Thanks

Edit

I'm not sure what exactly that "unicode" is, but I need to get each Arabic character as its equivalent machine number, considering that Arabic characters have different contextual forms depending on their position. See here:

http://en.wikipedia.org/wiki/Arabic_alphabet#Table_of_basic_letters

The same character in different positions:

ب‎ | ـب‎ | ـبـ‎ | بـ‎

I think there must be a way to convert each Arabic character into its equivalent number, but how?

Edit

I still believe there's a way to convert each character to the form that matches its position.

Any ideas are appreciated.

Al3bed asked May 30 '11 10:05


2 Answers

All you need is a function called utf8Glyphs, which you can find in ArGlyphs.class.php. Download it from ar-php, and visit Ar-PHP for more information about the project and its classes.

It returns the word reversed, with each character replaced by its contextual glyph.

Example of usage:

    <?php
    include('Arabic.php');
    $Arabic = new Arabic('ArGlyphs');

    $text = 'بسم الله الرحمن الرحيم';
    $text = $Arabic->utf8Glyphs($text);
    echo $text;
    ?>
FloatBird answered Oct 02 '22 16:10


I assume you want to convert بهروز to \u0628\u0647\u0631\u0648\u0632. Take a look at http://hsivonen.iki.fi/php-utf8/. After calling utf8ToUnicode('بهروز'), all you have to do is convert the integers in the returned array to hex, pad them to 4 digits, and prefix each one with \u. You can also get the same result using json_encode:

    echo json_encode('بهروز'); // prints "\u0628\u0647\u0631\u0648\u0632"
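
If you'd rather build the escapes by hand without an external library, the mbstring extension is enough. This is only a minimal sketch, assuming PHP 7.2+ (for mb_ord); toUnicodeEscapes is a made-up helper name, not part of any library mentioned here.

```php
<?php
// Hypothetical helper: turn a UTF-8 string into \uXXXX escapes.
// Assumes PHP 7.2+ with the mbstring extension (mb_ord) and valid UTF-8 input.
function toUnicodeEscapes(string $text): string
{
    // Split into whole characters; the "u" flag makes PCRE treat the input as UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($chars as $ch) {
        // mb_ord returns the code point as an integer, e.g. 0x0628 for ب.
        $out .= sprintf('\\u%04x', mb_ord($ch, 'UTF-8'));
    }
    return $out;
}

echo toUnicodeEscapes('بهروز'), "\n"; // \u0628\u0647\u0631\u0648\u0632
```

Note that characters outside the Basic Multilingual Plane would come out with more than 4 hex digits here, whereas json_encode writes them as surrogate pairs. Arabic letters are all below U+FFFF, so that doesn't matter for this question.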

EDIT:

It seems you want the character codes of بب, where the first character differs from the second. All you have to do is apply the bidi algorithm to your text using fribidi_log2vis, then get the character codes in one of the ways I described above.

Here's an example:

    $string = 'بب'; // \u0628\u0628
    $bidiString = fribidi_log2vis($string, FRIBIDI_LTR, FRIBIDI_CHARSET_UTF8);
    json_encode($bidiString); // \ufe90\ufe91
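
For some intuition about why the two ب come out with different codes: a joining letter has up to four presentation forms, and the one used depends on whether it connects to its neighbours. Here is a rough sketch for a word made only of ب. It is not a replacement for fribidi; shapeBehRun is a made-up helper, and real shaping also has to deal with letters that don't join forward, ligatures, diacritics and so on.

```php
<?php
// Presentation forms of ب (U+0628) from the Arabic Presentation Forms-B block.
const BEH_FORMS = [
    'isolated' => 0xFE8F,
    'final'    => 0xFE90,
    'initial'  => 0xFE91,
    'medial'   => 0xFE92,
];

// Hypothetical helper: shape a word made of $count consecutive ب letters.
// Returns the code points in logical (typing) order.
function shapeBehRun(int $count): array
{
    $shaped = [];
    for ($i = 0; $i < $count; $i++) {
        $joinsPrevious = $i > 0;          // there is a letter before this one
        $joinsNext     = $i < $count - 1; // there is a letter after this one

        if ($joinsPrevious && $joinsNext) {
            $form = 'medial';
        } elseif ($joinsPrevious) {
            $form = 'final';
        } elseif ($joinsNext) {
            $form = 'initial';
        } else {
            $form = 'isolated';
        }
        $shaped[] = BEH_FORMS[$form];
    }
    return $shaped;
}

// Visual (left-to-right) order is the logical order reversed.
$visual = array_reverse(shapeBehRun(2));
foreach ($visual as $codePoint) {
    printf('\\u%04x', $codePoint);
}
// Output: \ufe90\ufe91
```

In typing order the first ب takes the initial form (U+FE91) and the second the final form (U+FE90); reversing for left-to-right display gives the same \ufe90\ufe91 as the fribidi example.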

EDIT:

I just remembered that TCPDF has a bidi algorithm implemented in pure PHP, so if you cannot get the fribidi extension for PHP to work, you can use TCPDF instead (utf8Bidi is protected by default, so you need to make it public).

    require_once('utf8.inc'); // http://hsivonen.iki.fi/php-utf8/
    require_once('tcpdf.php'); // http://www.tcpdf.org/
    $t = new TCPDF();
    $text = 'بب';
    $t->utf8Bidi(utf8ToUnicode($text)); // will return an array like array(0 => 65168, 1 => 65169)
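
Once you have an array of integers like that, turning it into \u escapes or back into a UTF-8 string takes only a few lines. A small sketch, assuming PHP 7.2+ with mbstring (for mb_chr); both helper names are made up for illustration and are not part of TCPDF or utf8.inc.

```php
<?php
// Code points as in the example above (0xFE90 and 0xFE91).
$codePoints = [65168, 65169];

// Hypothetical helper: format code points as \uXXXX escapes.
function codePointsToEscapes(array $codePoints): string
{
    $out = '';
    foreach ($codePoints as $cp) {
        $out .= sprintf('\\u%04x', $cp); // pad to at least 4 hex digits
    }
    return $out;
}

// Hypothetical helper: build a UTF-8 string from code points.
// Assumes PHP 7.2+ with the mbstring extension (mb_chr).
function codePointsToUtf8(array $codePoints): string
{
    return implode('', array_map(function ($cp) {
        return mb_chr($cp, 'UTF-8');
    }, $codePoints));
}

echo codePointsToEscapes($codePoints), "\n"; // \ufe90\ufe91
```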
everplays answered Oct 02 '22 15:10