I am creating a file that is to be saved on a local user's computer (not rendered in a web browser).
I am currently using html_entity_decode, but this isn't converting characters like &#8211; (which is the n-dash), and I was wondering what other function I should be using.
For example, when the file is imported into the software, instead of the n-dash or just a - it shows up as &#8211;. I know I could use str_replace, but if it's happening with this character, it could happen with many others since the data is dynamic.
HTML encoding converts characters that are not allowed in HTML into character-entity equivalents; HTML decoding reverses the encoding. For example, when embedded in a block of text, the characters < and > are encoded as &lt; and &gt; for HTTP transmission.
You have to use the HTML character entities &lt; and &gt; in place of the < and > symbols so they aren't interpreted as HTML tags.
&gt; and &lt; are the character entity references for the > and < characters in HTML. You cannot use literal less-than (<) or greater-than (>) signs in HTML text, because the browser will confuse them with tags. To get around this, you can use entity names (such as &gt;) or entity numbers (such as &#60;).
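To make the encode/decode round trip above concrete, here is a minimal sketch using PHP's htmlspecialchars and htmlspecialchars_decode. The sample string is made up for illustration.

```php
<?php
// Hypothetical sample text containing characters that are special in HTML.
$raw = 'a < b > c';

// Encode: < and > become the named entities &lt; and &gt;.
$encoded = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo $encoded, PHP_EOL; // a &lt; b &gt; c

// Decode: reverses the encoding and restores the original text.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);
var_dump($decoded === $raw); // bool(true)
```

Note that htmlspecialchars only touches the handful of characters that are special in HTML markup; it is not the same as converting every non-ASCII character to an entity.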
You need to define the target character set. &#8211; is not a valid character in ISO-8859-1 (the default character set before PHP 5.4), so it's not decoded. Define UTF-8 as the output charset and it will decode:
echo html_entity_decode('&#8211;', ENT_NOQUOTES, 'UTF-8');
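Since the data is dynamic, it may help to see the same call applied to a string with several different entities. This is only a sketch: the input string is invented, and the ENT_QUOTES | ENT_HTML5 flags (available since PHP 5.4) are my own choice rather than something the question requires.

```php
<?php
// Hypothetical input: dynamic text with numeric and named entities mixed together.
$encoded = 'n&#8211;dash, &eacute;, &amp;, &quot;quoted&quot;';

// ENT_QUOTES also decodes quote entities; ENT_HTML5 selects the HTML5 entity table.
// The third argument sets the output charset explicitly to UTF-8.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $decoded, PHP_EOL;

// The en dash alone decodes to the three-byte UTF-8 sequence E2 80 93.
$dash = html_entity_decode('&#8211;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo bin2hex($dash), PHP_EOL; // e28093
```

If the importing software still shows odd characters after this, it is probably reading the file in a different encoding than UTF-8, which is a separate problem from entity decoding.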
If at all possible, you should avoid HTML entities to begin with. I don't know where that encoded data comes from, but if you're storing it like this in the database or elsewhere, you're doing it wrong. Always store data UTF-8 encoded and only convert to HTML entities or otherwise escape for output when necessary.
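A rough sketch of what "store UTF-8, escape only on output" could look like. The helper name forHtml, the sample text, and the temp-file path are all hypothetical; the "\u{2013}" escape for the en dash requires PHP 7 or later.

```php
<?php
// Raw UTF-8 text as it would be stored (hypothetical sample; \u{2013} is the en dash).
$stored = "Tom & Jerry \u{2013} <episode 1>";

// Hypothetical helper: escape only at the moment of HTML output.
function forHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// For HTML: only the HTML-special characters are escaped; the en dash stays as UTF-8.
echo forHtml($stored), PHP_EOL;

// For a plain-text file: write the stored text unchanged, no decoding step needed.
$path = tempnam(sys_get_temp_dir(), 'export'); // hypothetical output location
file_put_contents($path, $stored);
```

With this arrangement the file export never sees an entity in the first place, so there is nothing to decode.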
Try mb_convert_encoding() (note that the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2):
$string = "n&#8211;dash";
$output = mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
echo $output;