
How to convert UTF-8 characters to numeric character entities in PHP

Is a translation of the below code at all possible using PHP?

The code below is written in JavaScript. It returns HTML with numeric character references where needed, e.g. smslån -> smsl&#229;n

I have been unsuccessful at creating a translation. This script looked like it might work, but it returns &#195;&#165; for å instead of &#229; as the JavaScript below does.

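function toEntity() {
  var aa = document.form.utf.value;
  var bb = '';
  // Walk the string one UTF-16 code unit at a time.
  for (var i = 0; i < aa.length; i++)
  {
    // Anything above 127 is non-ASCII and becomes a numeric reference.
    if (aa.charCodeAt(i) > 127)
    {
      bb += '&#' + aa.charCodeAt(i) + ';';
    }
    else
    {
      bb += aa.charAt(i);
    }
  }
  document.form.entity.value = bb;
}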

PHP's ord function sounds like it does the same thing as charCodeAt, but it does not. I get 195 for å using ord and 229 using charCodeAt. That, or I am having some incredibly difficult encoding problems.
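
The two numbers are consistent with an encoding difference rather than a bug. In UTF-8, å (U+00E5) is stored as the two bytes 0xC3 0xA5, and ord() only reads the first byte of a string, which is 195. JavaScript's charCodeAt() works on UTF-16 code units, where å is the single value 229. A minimal sketch of the comparison, assuming the mbstring extension is available (mb_ord() needs PHP 7.2 or later); the variable names are illustrative only:

```php
<?php
// UTF-8 byte sequence for "å" (U+00E5), written as escapes so the
// example does not depend on how this file itself is saved.
$char = "\xC3\xA5";

// ord() inspects only the first byte of the string.
$firstByte = ord($char);

// strlen() counts bytes, so the two-byte sequence shows up here.
$byteCount = strlen($char);

// mb_ord() decodes the whole UTF-8 sequence into one code point.
$codePoint = mb_ord($char, 'UTF-8');

printf("ord: %d, bytes: %d, mb_ord: %d\n", $firstByte, $byteCount, $codePoint);
// prints "ord: 195, bytes: 2, mb_ord: 229"
```

So a loop that calls ord() on each byte emits one entity per byte, while a code-point-aware function gives the same 229 that charCodeAt() reports.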

darkAsPitch asked Sep 30 '11 09:09




1 Answer

Use mb_encode_numericentity:

$convmap = array(0x80, 0xffff, 0, 0xffff);
echo mb_encode_numericentity($utf8Str, $convmap, 'UTF-8');
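
For a self-contained version, the call can be wrapped in a small helper. This is only a sketch: the function name toNumericEntities is hypothetical, and the wider conversion map is an assumption on my part so that characters outside the Basic Multilingual Plane (emoji, for example) are converted too. Each group of four integers in the map is (start, end, offset, mask).

```php
<?php
// Hypothetical helper, not part of PHP: converts every non-ASCII
// character in a UTF-8 string to a decimal numeric character reference.
function toNumericEntities(string $utf8Str): string
{
    // Convert code points 0x80..0x10FFFF, add no offset, and use a
    // mask wide enough to keep all 21 bits of the code point.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($utf8Str, $convmap, 'UTF-8');
}

// "sms" + UTF-8 bytes for "å" + "n"
$encoded = toNumericEntities("smsl\xC3\xA5n");
echo $encoded, "\n"; // prints "smsl&#229;n"

// ASCII passes through unchanged.
$plain = toNumericEntities("plain ascii");

// UTF-8 bytes for U+1F600, a character outside the BMP.
$emoji = toNumericEntities("\xF0\x9F\x98\x80");
```

One difference from the JavaScript version is worth keeping in mind: charCodeAt() would report a character outside the BMP as two surrogate code units, whereas this helper emits a single reference for the full code point.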
phihag answered Oct 21 '22 13:10
