
Is there a faster way to decode html characters to a string than Html.fromHtml()?

I am using Html.fromHtml(STRING).toString() to convert a string that may or may not contain HTML markup and/or HTML entities into a plain-text string.

This is pretty slow: by my last measurement it took about 22 ms per call on average. With a large batch of strings it can add over a minute, so I am looking for a faster option that is built for performance.
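To compare candidates on equal terms, the per-call average can be measured with a small harness along these lines. This is only a sketch: `DecodeTimer`, `averageMillis` and `callsToReach` are hypothetical names, the warm-up is a single pass, and on a device the decoder passed in would be something like `s -> Html.fromHtml(s).toString()`.

```java
import java.util.function.UnaryOperator;

// Hypothetical timing helper; not part of Android or any library.
final class DecodeTimer {

    /** Returns the average wall-clock milliseconds per decoder call. */
    static double averageMillis(UnaryOperator<String> decoder, String[] inputs, int rounds) {
        if (inputs.length == 0 || rounds <= 0) {
            throw new IllegalArgumentException("need at least one input and one round");
        }
        // One untimed pass so class loading and JIT warm-up are not counted.
        for (String s : inputs) {
            decoder.apply(s);
        }
        long sink = 0; // accumulate results so the calls cannot be optimized away
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String s : inputs) {
                sink += decoder.apply(s).length();
            }
        }
        long elapsedNanos = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable; keeps sink live");
        }
        return elapsedNanos / 1_000_000.0 / ((long) rounds * inputs.length);
    }

    /** Number of calls needed before the total time reaches budgetMs. */
    static long callsToReach(double msPerCall, double budgetMs) {
        return (long) Math.ceil(budgetMs / msPerCall);
    }
}
```

At 22 ms per call, `DecodeTimer.callsToReach(22.0, 60_000.0)` gives 2728, so a batch of a few thousand strings is enough to cost a full minute.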

Is there any way to speed this up, or are there other decoding options available?

Edit: Since there doesn't appear to be a built-in method that is faster or designed specifically for performance, I will award the bounty to anyone who can point me to a library that:

  • Works well with Android
  • Is licensed for free use
  • Is faster than Html.fromHtml(String).toString()

As a note, I already tried Jsoup, using Jsoup.parse(String).text(), and it was slower.

cottonBallPaws asked Dec 01 '10 06:12



2 Answers

What about unescapeHtml() from org.apache.commons.lang.StringEscapeUtils? The library is available from the Apache Commons site.

(EDIT: June 2019 - See the comments below for updates about the library)
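For a sense of what entity unescaping involves without a full HTML parse, here is a minimal single-pass sketch in plain Java. It is not the Commons Lang implementation: `EntityDecoder` is a hypothetical name, the named-entity table is cut down to six entries, and tags are left untouched, so it only covers the "HTML entities" half of the question. Whether something like it actually beats Html.fromHtml() on a given device would need measuring.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Android or Commons Lang.
final class EntityDecoder {

    // Deliberately tiny table; a real library covers the full HTML entity set.
    private static final Map<String, Character> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
        NAMED.put("apos", '\'');
        NAMED.put("nbsp", '\u00A0');
    }

    /** Decodes known named entities and numeric entities; leaves the rest as-is. */
    static String decode(String in) {
        int amp = in.indexOf('&');
        if (amp < 0) {
            return in; // fast path: nothing to decode, no allocation
        }
        StringBuilder out = new StringBuilder(in.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(in, pos, amp); // copy the text before the '&'
            int semi = in.indexOf(';', amp + 1);
            String decoded = null;
            // Only look at short, non-empty candidates such as "amp" or "#x1F600".
            if (semi > amp + 1 && semi - amp <= 10) {
                decoded = lookup(in.substring(amp + 1, semi));
            }
            if (decoded != null) {
                out.append(decoded);
                pos = semi + 1;
            } else {
                out.append('&'); // not an entity we know: keep the '&' literally
                pos = amp + 1;
            }
            amp = in.indexOf('&', pos);
        }
        out.append(in, pos, in.length());
        return out.toString();
    }

    // Returns the replacement text, or null if the body is not recognized.
    private static String lookup(String body) {
        if (body.charAt(0) == '#') {
            try {
                boolean hex = body.length() > 1
                        && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
                int cp = hex
                        ? Integer.parseInt(body.substring(2), 16)
                        : Integer.parseInt(body.substring(1));
                return Character.isValidCodePoint(cp)
                        ? new String(Character.toChars(cp))
                        : null;
            } catch (NumberFormatException e) {
                return null; // malformed number such as "&#;" or "&#xZZ;"
            }
        }
        Character c = NAMED.get(body);
        return c == null ? null : String.valueOf(c);
    }
}
```

For example, `EntityDecoder.decode("Tom &amp; Jerry &lt;3")` returns `Tom & Jerry <3`, while an unknown entity such as `&bogus;` passes through unchanged.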

karlcow answered Sep 19 '22 10:09


fromHtml() does not use a high-performance HTML parser, and I have no idea how fast the toString() implementation on SpannedString is. I doubt either was designed for your scenario.

Ideally, the strings are clean before they get to a low-power phone. Either clean them up in the build process (for resources/assets), or clean them up on a server (before you download them).

If, for whatever reason, you absolutely need to clean them up on the device, you can perhaps use the NDK to create a C/C++ library that does the cleaning for you faster.

CommonsWare answered Sep 22 '22 10:09