Convert HTML Character Back to Text Using Java Standard Library

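I think the StringEscapeUtils.unescapeHtml3() and unescapeHtml4() methods are what you are looking for. They originated in Apache Commons Lang 3, where the class has been deprecated since version 3.6 in favor of Apache Commons Text, which provides the same methods. See https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringEscapeUtils.html.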

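Alternatively, add the jsoup jar file to your application's lib directory and then use this code. Note that jsoup parses the input as HTML, so it strips any real tags in addition to decoding the entities.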

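import org.jsoup.Jsoup;

public class Encoder {
    public static void main(String[] args) {
        // Parse the fragment as HTML and extract its text;
        // the entities are decoded along the way.
        String s = Jsoup.parse("&lt;Fran&ccedil;ais&gt;").text();
        System.out.print(s); // <Français>
    }
}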

Link to download jsoup: http://jsoup.org/download

java.net.URLDecoder deals only with the application/x-www-form-urlencoded MIME format (e.g. "%20" represents space), not with HTML character entities. I don't think there's anything on the Java platform for that. You could write your own utility class to do the conversion, like this one.
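A hand-written utility of that kind could look roughly like the sketch below. It is only an illustration under a few assumptions: Java 9 or later (for Map.of and the StringBuilder overload of Matcher.appendReplacement), a hypothetical class name HtmlEntityDecoder, and a deliberately tiny table of named entities. The full HTML 4 table has roughly 250 names, so a real utility would need a much larger map. Numeric references in decimal and hexadecimal form are handled generally, and anything unrecognized is left untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, standard library only. Not a complete entity decoder.
final class HtmlEntityDecoder {

    // Small illustrative subset of named entities; extend as needed.
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&", "lt", "<", "gt", ">", "quot", "\"", "apos", "'",
            "nbsp", "\u00A0", "ccedil", "\u00E7");

    // Matches &#205; (decimal), &#xCD; (hex) and &name; references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);");

    private HtmlEntityDecoder() {}

    static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        while (m.find()) {
            String replacement = resolve(m.group(1));
            if (replacement == null) {
                replacement = m.group(); // unknown reference: keep it verbatim
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text, or null if the reference is not recognized.
    private static String resolve(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1));
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null; // too many digits to be a code point
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("Happy &amp; Sad"));           // Happy & Sad
        System.out.println(decode("GU&#205;A TELEF&#211;NICA")); // GUÍA TELEFÓNICA
    }
}
```

Because the input is scanned in a single pass, a string such as "&amp;lt;" decodes to "&lt;" rather than being decoded twice.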

URLDecoder should only be used for decoding strings from URLs generated by HTML forms, which use the "application/x-www-form-urlencoded" MIME type. It does not decode HTML character entities.
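The difference is easy to see in a small experiment. The sketch below assumes Java 10 or later (for the URLDecoder.decode overload that takes a Charset); the class and method names are made up for the demonstration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what URLDecoder does and does not decode.
final class UrlDecoderDemo {

    static String formDecode(String s) {
        // Decodes %XX escapes and turns '+' into a space, nothing else.
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Form-encoded input is decoded as expected.
        System.out.println(formDecode("Happy%20%26+Sad")); // Happy & Sad
        // HTML entities pass through unchanged.
        System.out.println(formDecode("Happy &amp; Sad")); // Happy &amp; Sad
    }
}
```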

After a search I found a Translate class within the HTML Parser library.

You can use the class org.apache.commons.lang.StringEscapeUtils:

String s = StringEscapeUtils.unescapeHtml("Happy &amp; Sad");

This works for me.

I'm not aware of any way to do it using the standard library. But I do know and use this class, which deals with HTML entities.

"HTMLEntities is an Open Source Java class that contains a collection of static methods (htmlentities, unhtmlentities, ...) to convert special and extended characters into HTML entities and vice versa."

http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=htmlentities
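For what it is worth, the JDK does ship an HTML parser in its Swing packages (the java.desktop module on Java 9 and later), and it decodes entities while parsing. The following is a minimal sketch, assuming that module is available at runtime; the class name SwingHtmlText and the method toText are made up for illustration. Like any parser-based approach it also discards real tags, and it targets the old HTML 3.2 dialect, so it may behave unexpectedly on modern markup or on entities outside its DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text nodes reported by the Swing HTML parser.
final class SwingHtmlText extends HTMLEditorKit.ParserCallback {

    private final StringBuilder text = new StringBuilder();

    @Override
    public void handleText(char[] data, int pos) {
        // The parser hands over text with entities already decoded.
        text.append(data);
    }

    static String toText(String html) {
        SwingHtmlText callback = new SwingHtmlText();
        try {
            // The last argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.text.toString();
    }

    public static void main(String[] args) {
        System.out.println(toText("Happy &amp; Sad"));
    }
}
```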

Or you can use unescapeHtml4:

    String miCadena = "GU&#205;A TELEF&#211;NICA";
    System.out.println(StringEscapeUtils.unescapeHtml4(miCadena));

This code prints the line: GUÍA TELEFÓNICA

Tags: java, html, html-entities
