I have xml like this:
<formula type="inline">
<default:math xmlns="http://www.w3.org/1998/Math/MathML">
<default:mi>
ℤ
</default:mi>
</default:math>
</formula>
My goal is to get rid of all special entities like ℤ
by replacing them by their numeric entity presentations.
I tried :
$test = <content of the xml>;
$convmap = array(0x80, 0xffff, 0, 0xffff);
$test = mb_encode_numericentity($test, $convmap, 'UTF-8');
But this will not replace the ℤ
Any idea?
My goal is to get:
ℤ
as shown here: http://www.fileformat.info/info/unicode/char/2124/index.htm
Thank you.
Your converter is converting your LaTeX into MathML, not HTML entities. You need something that converts directly into HTML character references, or a MathML to HTML character reference converter.
You should be able to use htmlentities
:
htmlentities($symbolsToEncode, ENT_XML1, 'UTF-8');
http://pt1.php.net/htmlentities
You can change ENT_XML1
to ENT_SUBSTITUTE
and it will return Unicode Replacement Characters or Hex character references.
As an alternative, you could use strtr
to convert the characters to something you specify:
$chars = array(
"\x8484" => "蒄"
...
);
$convertedXML = strtr($xml, $chars);
http://php.net/strtr
Someone has done something similar on GitHub.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With