
htmlentities() double encoding entities in string

Tags:

php

I want only the unencoded characters to get converted to html entities, without affecting the entities which are already present. I have a string that has previously encoded entities, e.g.:

gaIUSHIUGhj>&dash; hjb&times;jkn.jhuh>hh> &hellip;

When I use htmlentities(), the & at the beginning of each entity gets encoded again. This means &dash; and other entities have their & encoded to &amp;:

&amp;times;

I tried decoding the complete string, then encoding it again, but it does not seem to work properly. This is the code I tried:

header('Content-Type: text/html; charset=iso-8859-1');
...

$b = 'gaIUSHIUGhj>&dash; hjb&times;jkn.jhuh>hh> &hellip;';
$b = html_entity_decode($b, ENT_QUOTES, 'UTF-8');
$b = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $b);
$b = htmlentities($b, ENT_QUOTES, 'UTF-8'); 

Is there a way to prevent this double encoding?

user2150616 asked Mar 09 '13 03:03



1 Answer

Set the optional $double_encode parameter (the fourth argument of htmlentities()) to false, so that existing entities are left as they are. See the documentation for more information.

Your resulting code should look like:

$b = htmlentities($b, ENT_QUOTES, 'UTF-8', false);
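
To see the difference, here is a small sketch comparing the default behaviour with $double_encode set to false. The sample string and variable names are made up for illustration, and it only uses entities from the default HTML 4.01 table; entities outside that table may need a different flag such as ENT_HTML5, so test with your real data.

```php
<?php
// Hypothetical sample: already-encoded entities mixed with raw special characters.
$input = 'x &gt; y &times; z < w & "q"';

// Default (double_encode = true): the & of every existing entity is encoded again.
$double = htmlentities($input, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing entities are kept, raw characters are still encoded.
$single = htmlentities($input, ENT_QUOTES, 'UTF-8', false);

echo $double, PHP_EOL; // x &amp;gt; y &amp;times; z &lt; w &amp; &quot;q&quot;
echo $single, PHP_EOL; // x &gt; y &times; z &lt; w &amp; &quot;q&quot;
```

Note that the bare & followed by a space is still converted to &amp; in both cases, because it does not start an entity.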
Niet the Dark Absol answered Sep 25 '22 13:09