
How can I decode HTML entities in C?

Tags:

c

html

I'm interested in unescaping text in C. For example, &#92; should map to \. Does anyone know of a good library?

For reference, see Wikipedia's List of XML and HTML character entity references.

FelipeC asked Jul 04 '09 12:07




1 Answer

For another open-source reference implementation in C, see the command-line utilities uni2ascii and ascii2uni. The relevant files are enttbl.{c,h}, which handle entity lookup, and putu8.c, which converts UTF-32 down to UTF-8.

uni2ascii
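
To illustrate the two pieces mentioned above (an entity lookup table and a UTF-32 to UTF-8 conversion), here is a minimal sketch in C. It is not the uni2ascii code: the names ENTITIES, put_utf8, parse_entity and decode_html_entities are hypothetical, the table holds only a handful of named entities, and the terminating semicolon is assumed to be required. A real decoder would need the full entity list from the reference above.

```c
#include <stddef.h>
#include <string.h>

/* Tiny illustrative table (assumption: a real decoder needs the full list). */
struct named_entity { const char *name; unsigned long cp; };
static const struct named_entity ENTITIES[] = {
    { "amp", 0x26 }, { "lt", 0x3C }, { "gt", 0x3E }, { "quot", 0x22 },
    { "apos", 0x27 }, { "nbsp", 0xA0 }, { "copy", 0xA9 }, { "euro", 0x20AC },
};

/* Encode one code point as UTF-8; returns bytes written, 0 if not encodable. */
static size_t put_utf8(unsigned long cp, char *out)
{
    if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* NUL, out of range, or surrogate */
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Parse an entity at s (s[0] == '&'). On success, store the code point in
   *cp and return the number of input bytes consumed; otherwise return 0. */
static size_t parse_entity(const char *s, unsigned long *cp)
{
    const char *semi = strchr(s + 1, ';');
    size_t len, i;

    if (semi == NULL)
        return 0;
    len = (size_t)(semi - (s + 1));     /* text between '&' and ';' */
    if (len == 0 || len > 10)
        return 0;

    if (s[1] == '#') {                  /* numeric: &#92; or &#x5C; */
        const char *p = s + 2;
        unsigned long v = 0;
        int base = 10;

        if (*p == 'x' || *p == 'X') { base = 16; p++; }
        if (p == semi)
            return 0;                   /* no digits at all */
        for (; p < semi; p++) {
            int d;
            if (*p >= '0' && *p <= '9') d = *p - '0';
            else if (base == 16 && *p >= 'a' && *p <= 'f') d = *p - 'a' + 10;
            else if (base == 16 && *p >= 'A' && *p <= 'F') d = *p - 'A' + 10;
            else return 0;
            v = v * base + d;
            if (v > 0x10FFFF)
                return 0;
        }
        *cp = v;
        return len + 2;
    }

    for (i = 0; i < sizeof ENTITIES / sizeof ENTITIES[0]; i++) {
        if (strlen(ENTITIES[i].name) == len &&
            strncmp(s + 1, ENTITIES[i].name, len) == 0) {
            *cp = ENTITIES[i].cp;
            return len + 2;
        }
    }
    return 0;
}

/* Decode src into dst (dst_size bytes, including the terminating NUL).
   Unrecognised entities are copied through unchanged.
   Returns 0 on success, -1 if dst is too small. */
int decode_html_entities(const char *src, char *dst, size_t dst_size)
{
    size_t o = 0;

    if (dst_size == 0)
        return -1;
    while (*src != '\0') {
        char buf[4];
        size_t n = 0, used = 0;
        unsigned long cp;

        if (*src == '&' && (used = parse_entity(src, &cp)) != 0)
            n = put_utf8(cp, buf);
        if (n == 0) {                   /* not a valid entity: copy one byte */
            buf[0] = *src;
            n = 1;
            used = 1;
        }
        if (o + n >= dst_size) {        /* keep room for the NUL */
            dst[o] = '\0';
            return -1;
        }
        memcpy(dst + o, buf, n);
        o += n;
        src += used;
    }
    dst[o] = '\0';
    return 0;
}
```

Since decoded output is never longer than the input, a destination buffer of strlen(src) + 1 bytes is always large enough for this sketch. The linear table scan is fine for a few entries; with the full entity list, a sorted table searched with bsearch would be a more reasonable choice.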

Cameron Lowell Palmer answered Sep 24 '22 03:09