
Can I get the Unicode value of a character, or vice versa, with PHP?

Tags:

php

unicode

utf-8

Is it possible to input a character and get its Unicode value back? For example, I can put &#12103; in HTML to output "⽇". Is it possible to pass that character as an argument to a function and get the number as output, without building a Unicode table?

$val = someFunction("⽇");//returns 12103 

or the reverse?

$val2 = someOtherFunction(12103);//returns "⽇" 

I would like to output the actual characters to the page, not the codes, and I would also like to get the code from the character if possible. The closest I have come is php.net/manual/en/function.mb-decode-numericentity.php, but I can't get it working. Is this the function I need, or am I on the wrong track?

Totoro asked Feb 20 '12 12:02
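On the mb_decode_numericentity attempt mentioned in the question: that function can do the conversion, but it needs a conversion map and a complete entity, including the trailing semicolon. The following is only a minimal sketch, assuming the mbstring extension is loaded. The $map values are one possible choice (not taken from the question), and the variable names are purely illustrative.

```php
<?php
// Conversion map: [first code point, last code point, offset, mask].
// Assumption: this range (U+0080..U+10FFFF) is chosen for illustration;
// it leaves plain ASCII untouched when encoding.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Numeric entity -> character
$char = mb_decode_numericentity('&#12103;', $map, 'UTF-8');

// Character -> numeric entity
$entity = mb_encode_numericentity($char, $map, 'UTF-8');

echo $char, "\n";
echo $entity, "\n";
```

This yields an entity string rather than a bare integer, so it fits best when the goal is HTML output.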



2 Answers

function _uniord($c) {
    if (ord($c[0]) >= 0 && ord($c[0]) <= 127)
        return ord($c[0]);
    if (ord($c[0]) >= 192 && ord($c[0]) <= 223)
        return (ord($c[0])-192)*64 + (ord($c[1])-128);
    if (ord($c[0]) >= 224 && ord($c[0]) <= 239)
        return (ord($c[0])-224)*4096 + (ord($c[1])-128)*64 + (ord($c[2])-128);
    if (ord($c[0]) >= 240 && ord($c[0]) <= 247)
        return (ord($c[0])-240)*262144 + (ord($c[1])-128)*4096 + (ord($c[2])-128)*64 + (ord($c[3])-128);
    if (ord($c[0]) >= 248 && ord($c[0]) <= 251)
        return (ord($c[0])-248)*16777216 + (ord($c[1])-128)*262144 + (ord($c[2])-128)*4096 + (ord($c[3])-128)*64 + (ord($c[4])-128);
    if (ord($c[0]) >= 252 && ord($c[0]) <= 253)
        return (ord($c[0])-252)*1073741824 + (ord($c[1])-128)*16777216 + (ord($c[2])-128)*262144 + (ord($c[3])-128)*4096 + (ord($c[4])-128)*64 + (ord($c[5])-128);
    if (ord($c[0]) >= 254 && ord($c[0]) <= 255)    //  error
        return FALSE;
    return 0;
}   //  function _uniord()

and

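function _unichr($o) {
    if (function_exists('mb_convert_encoding')) {
        // Note: 'HTML-ENTITIES' as an mbstring encoding is deprecated as of PHP 8.2.
        return mb_convert_encoding('&#'.intval($o).';', 'UTF-8', 'HTML-ENTITIES');
    } else {
        // Fallback without mbstring: chr() yields a single byte, so this is only meaningful for values 0..127.
        return chr(intval($o));
    }
}   // function _unichr()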
Mark Baker answered Oct 17 '22 22:10
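To make the arithmetic in _uniord() concrete, here is a small worked example for the character from the question, whose UTF-8 bytes are E2 BD 87. It is only a sketch: the variable names are illustrative, and the comparison against mb_ord() and mb_chr() assumes PHP 7.2 or later with the mbstring extension loaded, which the answer above does not rely on.

```php
<?php
// "\xE2\xBD\x87" is the UTF-8 encoding of U+2F47 (decimal 12103).
$c = "\xE2\xBD\x87";

// Same arithmetic as the 224..239 (three-byte) branch of _uniord().
$byHand = (ord($c[0]) - 224) * 4096   // (226 - 224) * 4096 = 8192
        + (ord($c[1]) - 128) * 64     // (189 - 128) * 64   = 3904
        + (ord($c[2]) - 128);         // (135 - 128)        = 7

// Built-in equivalents (assumption: PHP 7.2+ with mbstring).
$builtin = mb_ord($c, 'UTF-8');
$back    = mb_chr(12103, 'UTF-8');

var_dump($byHand === 12103, $builtin === 12103, $back === $c);
```

Where mb_ord() and mb_chr() are available, they are likely the simplest option and avoid hand-written byte arithmetic.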


Here's a more compact implementation of unichr/uniord based on iconv and pack:

// code point to UTF-8 string
function unichr($i) {
    return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
}

// UTF-8 string to code point
function uniord($s) {
    return unpack('V', iconv('UTF-8', 'UCS-4LE', $s))[1];
}
bobince answered Oct 17 '22 21:10
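A short usage sketch for the iconv/pack pair, assuming the iconv extension is available and PHP 5.4 or later (needed for the unpack(...)[1] dereference). The two functions are repeated from the answer so the snippet runs on its own. The test inputs, including the four-byte character U+1F600, are illustrative choices and not part of the original answer.

```php
<?php
// Repeated from the answer above so this snippet is self-contained.
function unichr($i) {
    return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
}
function uniord($s) {
    return unpack('V', iconv('UTF-8', 'UCS-4LE', $s))[1];
}

$code  = uniord("\xE2\xBD\x87");       // the character from the question
$char  = unichr(12103);                // back to a UTF-8 string
$emoji = uniord("\xF0\x9F\x98\x80");   // U+1F600, a four-byte sequence

var_dump($code, bin2hex($char), $emoji);
```

Because the conversion goes through a fixed-width 32-bit encoding, the same two lines cover every sequence length without per-length branches.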