I want to convert all the text in a string into HTML entities while preserving the HTML tags. For example, this:
<p><font style="color:#FF0000">Camión español</font></p>
should be translated into this:
<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>
any ideas?
htmlentities() Function: The htmlentities() function is an inbuilt function in PHP that converts every character that has an HTML entity equivalent into that entity.
The htmlentities() function converts characters to HTML entities. Tip: To convert HTML entities back to characters, use the html_entity_decode() function. Tip: Use the get_html_translation_table() function to return the translation table used by htmlentities().
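A minimal sketch of that round trip, assuming the source file is saved as UTF-8. The variable names are only illustrative, and the flags and encoding are passed explicitly so the result does not depend on the defaults of a particular PHP version:

```php
<?php
$text = 'Camión español';

// Encode every character that has a named entity.
$encoded = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Decode with the same flags and encoding to get the original string back.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($encoded);           // string(28) "Cami&oacute;n espa&ntilde;ol"
var_dump($decoded === $text); // bool(true)
```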
Description. The htmlspecialchars() function converts special characters, namely & (ampersand), " (double quote), ' (single quote), < (less than) and > (greater than), to HTML entities: & becomes &amp;, " becomes &quot;, ' becomes &#039; (only when ENT_QUOTES is set), < becomes &lt; and > becomes &gt;.
HTML encoding converts characters that are not allowed in HTML into character-entity equivalents; HTML decoding reverses the encoding. For example, when embedded in a block of text, the characters < and > are encoded as &lt; and &gt; for HTTP transmission.
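To make the difference between the two functions concrete, here is a small sketch on a made-up sample string (assumed to be UTF-8). It also shows why neither function solves the question on its own: both escape the tags themselves.

```php
<?php
$sample = '<p>Camión & "español"</p>';

// htmlspecialchars() only touches &, ", ', < and >.
$special = htmlspecialchars($sample, ENT_QUOTES, 'UTF-8');
// &lt;p&gt;Camión &amp; &quot;español&quot;&lt;/p&gt;

// htmlentities() also converts accented characters that have a named entity.
$entities = htmlentities($sample, ENT_QUOTES, 'UTF-8');
// &lt;p&gt;Cami&oacute;n &amp; &quot;espa&ntilde;ol&quot;&lt;/p&gt;

var_dump($special, $entities);
```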
You can get the list of character => entity mappings used by htmlentities with the function get_html_translation_table; consider this code:
$list = get_html_translation_table(HTML_ENTITIES);
var_dump($list);
(You might want to check the second parameter to that function in the manual -- maybe you'll need to set it to a value different than the default one)
It will get you something like this :
array
' ' => string '&nbsp;' (length=6)
'¡' => string '&iexcl;' (length=7)
'¢' => string '&cent;' (length=6)
'£' => string '&pound;' (length=7)
'¤' => string '&curren;' (length=8)
....
....
....
'ÿ' => string '&yuml;' (length=6)
'"' => string '&quot;' (length=6)
'<' => string '&lt;' (length=4)
'>' => string '&gt;' (length=4)
'&' => string '&amp;' (length=5)
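As a hedged illustration of the "second parameter" remark, the sketch below passes the flags and the encoding explicitly, assuming the script and its input are UTF-8. Which entries the table contains depends on those arguments, so it is worth dumping it on your own setup:

```php
<?php
// Explicit flags and encoding, so the table does not depend on
// version-specific defaults.
$table = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

// With 'UTF-8' as the encoding, the keys are UTF-8 strings.
var_dump($table['ó']); // string(8) "&oacute;"
var_dump($table['ñ']); // string(8) "&ntilde;"
var_dump($table['<']); // string(4) "&lt;"
```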
Now, remove the mappings you don't want:
unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);
Your list now has all the character => entity mappings used by htmlentities, except for the few characters you don't want to encode.
And now, you just have to extract the list of keys and values:
$search = array_keys($list);
$values = array_values($list);
And, finally, you can use str_replace to do the replacement :
$str_in = '<p><font style="color:#FF0000">Camión español</font></p>';
$str_out = str_replace($search, $values, $str_in);
var_dump($str_out);
And you get :
string '<p><font style="color:#FF0000">Cami&Atilde;&sup3;n espa&Atilde;&plusmn;ol</font></p>' (length=84)
Which looks like what you wanted ;-)
Edit : well, except for the encoding problem (damn UTF-8, I suppose -- I'm trying to find a solution for that, and will edit again)
Second edit, a couple of minutes later: it seems you'll have to use utf8_encode on the $search list before calling str_replace :-(
Which means using something like this :
$search = array_map('utf8_encode', $search);
Between the call to array_keys and the call to str_replace.
And, this time, you should really get what you wanted :
string '<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>' (length=70)
And here is the full portion of code :
$list = get_html_translation_table(HTML_ENTITIES);
unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);
$search = array_keys($list);
$values = array_values($list);
$search = array_map('utf8_encode', $search);
$str_in = '<p><font style="color:#FF0000">Camión español</font></p>';
$str_out = str_replace($search, $values, $str_in);
var_dump($str_in, $str_out);
And the full output :
string '<p><font style="color:#FF0000">Camión español</font></p>' (length=58)
string '<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>' (length=70)
This time, it should be ok ^^
It doesn't really fit in one line, and it might not be the most optimized solution; but it should work fine, and it has the advantage of letting you add or remove any character => entity mapping as needed.
Have fun !
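For what it's worth, the steps of this answer can be sketched as a single function on current PHP versions. This is only a sketch under assumptions: the helper name encode_text_keep_tags is made up, the input is assumed to be valid UTF-8, and the table is requested with UTF-8 keys directly instead of being passed through utf8_encode (which is deprecated as of PHP 8.2).

```php
<?php
// Hypothetical helper name; assumes $html is valid UTF-8.
function encode_text_keep_tags(string $html): string
{
    // Ask for UTF-8 keys directly, so no utf8_encode() pass is needed.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

    // Leave markup characters alone so tags and attributes survive.
    // unset() on a key that is not in the table is harmless.
    unset($table['<'], $table['>'], $table['&'], $table['"'], $table["'"]);

    return str_replace(array_keys($table), array_values($table), $html);
}

$str_in  = '<p><font style="color:#FF0000">Camión español</font></p>';
$str_out = encode_text_keep_tags($str_in);

var_dump($str_out);
// string(70) "<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>"
```

Because & is left untouched, an ampersand that appears in the text itself is not escaped either; whether that is acceptable depends on your input.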
Might not be terribly efficient, but it works
$sample = '<p><font style="color:#FF0000">Camión español</font></p>';
echo htmlspecialchars_decode(
htmlentities($sample, ENT_NOQUOTES, 'UTF-8', false)
, ENT_NOQUOTES
);
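One thing to be aware of with this approach, sketched below with a hypothetical wrapper function and made-up inputs: htmlspecialchars_decode cannot tell the tags' own < and > apart from ones that were already escaped in the text, so an existing &lt; in the content comes back out as a literal <.

```php
<?php
// Hypothetical helper wrapping the two calls from the snippet above.
function entities_keep_tags(string $html): string
{
    // double_encode = false leaves entities already present in the input alone.
    $encoded = htmlentities($html, ENT_NOQUOTES, 'UTF-8', false);

    // Turn &lt; &gt; &amp; back into markup characters.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

$plain   = entities_keep_tags('<p>Camión español</p>');
$escaped = entities_keep_tags('<p>1 &lt; 2, Camión</p>');

var_dump($plain);   // <p>Cami&oacute;n espa&ntilde;ol</p>
var_dump($escaped); // <p>1 < 2, Cami&oacute;n</p>
```

If the text can contain escaped markup characters, the translation-table approach is probably the safer choice, since it never touches <, > or &.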
This is an optimized version of the accepted answer.
$list = get_html_translation_table(HTML_ENTITIES);
unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);
$string = strtr($string, $list);
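Besides being shorter, strtr behaves differently from str_replace in a way that matters for translation tables. The toy pairs below are not entities; they are chosen only to make the difference visible:

```php
<?php
// Toy pairs, chosen only to make the ordering effect visible.
$pairs = ['a' => 'b', 'b' => 'c'];

// str_replace() applies each pair in turn, so the 'b' produced by the first
// pair is replaced again by the second pair.
$sequential = str_replace(array_keys($pairs), array_values($pairs), 'ab');

// strtr() scans the input once and never rescans what it has already replaced.
$singlePass = strtr('ab', $pairs);

var_dump($sequential); // string(2) "cc"
var_dump($singlePass); // string(2) "bc"
```

With the entity table this means strtr stays correct even if you keep & in the list, whereas str_replace could then re-encode the & of entities it inserted earlier, depending on the order of the table.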