
PHP Escape a string if it hasn't already been escaped with entities

I'm using a 3rd party API that seems to return its data with the entity codes already in there, such as The Lion&#8217;s Pride.

If I print the string as-is from the API it renders just fine in the browser (in the example above it would put in an apostrophe). However, I can't trust that the API will always use the entities in the future, so I want to use something like htmlentities or htmlspecialchars myself before I print it. The problem with this is that it will encode the ampersand in the entity code again, and the end result will be The Lion&amp;#8217;s Pride in the HTML source, which doesn't render anything user friendly.
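The double-encoding problem described above can be reproduced with a minimal sketch. The sample string is an assumption: it supposes the API sends the apostrophe as the numeric entity &#8217;, and the variable names are illustrative only.

```php
<?php
// Assumed sample of what the API returns: the apostrophe is already an entity.
$fromApi = 'The Lion&#8217;s Pride';

// Escaping again encodes the ampersand that starts the existing entity.
$escapedAgain = htmlspecialchars($fromApi, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $escapedAgain, PHP_EOL; // The Lion&amp;#8217;s Pride
```

A browser would then display the entity code itself rather than an apostrophe.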

How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?

The Unknown Dev asked Nov 08 '22 07:11


2 Answers

No one seems to be answering your actual question, so I will:

How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?

It's impossible. What if I'm making an educational post about HTML entities and I want to actually print this on the screen:

The Lion&#8217;s Pride

... it would need to be encoded as...

The Lion&amp;#8217;s Pride

But what if that was the actual string we wanted to print on the screen? It would need yet another layer of encoding, and so on.
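The ambiguity can be sketched in a few lines. The helper name interpretations() is hypothetical, and the sample assumes the numeric entity &#8217;; the point is only that one received string has two equally valid readings, and nothing in the string says which one applies.

```php
<?php
// Hypothetical helper: lists both readings of one received string.
function interpretations(string $received): array
{
    $flags = ENT_QUOTES | ENT_HTML5;

    return [
        // Reading 1: the string is already encoded, so the intended text is its decoded form.
        'already_encoded' => html_entity_decode($received, $flags, 'UTF-8'),
        // Reading 2: the string is raw text, so the intended text is the string itself.
        'raw' => $received,
    ];
}

$readings = interpretations('The Lion&#8217;s Pride');

// The two readings differ, so the correct output depends on knowledge outside the string.
var_dump($readings['already_encoded'] === $readings['raw']); // bool(false)
```

Under the first reading the string should be printed as-is; under the second it should be passed through htmlspecialchars first.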


The bottom line is that you have to know what you've been given and work from there. That is where the advice from the other answers comes in, and it is still just a workaround.

What if they give you double-encoded strings? What if they start wrapping the HTML-encoded strings in XML? And then wrap that in JSON? And then the JSON is converted to binary strings? The possibilities are endless.

It's not impossible for the API you depend on to suddenly switch the output type, but it's also a pretty big violation of the original contract with your users. To some extent, you have to put some trust in the API to do what it says it's going to do. Unit/Integration tests make up the rest of the trust.

And because you could never write a program that works for any possible change they could make, it's senseless to try to anticipate any change at all.

Mulan answered Nov 14 '22 23:11


Decode the string first with html_entity_decode(), then re-encode the entities:

$string = htmlspecialchars(html_entity_decode($string));

https://eval.in/662095
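As a slightly fuller sketch of the same decode-then-encode idea, the version below passes explicit flags and a charset, since the default flags differ between PHP versions. The function name normalizeEntities() and the choice of ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5 with UTF-8 are assumptions for illustration, not part of the one-liner above.

```php
<?php
// Hypothetical helper name; not part of PHP or of the original answer.
function normalizeEntities(string $value): string
{
    // Explicit flags and charset so behaviour does not depend on PHP-version defaults.
    $flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5;

    // Step 1: turn any existing entities back into plain characters (single pass).
    $decoded = html_entity_decode($value, $flags, 'UTF-8');

    // Step 2: escape the HTML special characters exactly once.
    return htmlspecialchars($decoded, $flags, 'UTF-8');
}

echo normalizeEntities('Tom & Jerry'), PHP_EOL;     // Tom &amp; Jerry
echo normalizeEntities('Tom &amp; Jerry'), PHP_EOL; // Tom &amp; Jerry
```

Raw and once-encoded input end up in the same form, and running the function twice gives the same result as running it once. The trade-off is that input meant to display a literal entity code is decoded in step 1 and shows up as the character instead, which is the ambiguity the other answer describes.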

ʰᵈˑ answered Nov 14 '22 23:11