How to remove HTML Entities in Jsoup?

Tags: java, html, jsoup

How can I remove HTML entities using Jsoup? When I call Element.toString(), the entities stay escaped in the output:

(...)
       <td>Letter &oacute;</td> // expected: <td>Letter ó</td>
(...)


asked Nov 13 '13 20:11

barwnikk

2 Answers

This may be tangential to your question, but if you only want to decode HTML entities without changing anything else in the string (no tag processing, no comment stripping, etc.), you can use org.jsoup.parser.Parser.unescapeEntities, e.g.:

assert Parser.unescapeEntities("x &asymp; <i>y</i>\n", true)
    .equals("x ≈ <i>y</i>\n");
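A minimal, self-contained sketch of the same call, applied to the markup from the question. The class name EntityDecodeDemo and the helper decode are made up for illustration; only Parser.unescapeEntities comes from jsoup. The second argument tells jsoup whether the string is an attribute value, so false is used here because the input is element content.

```java
import org.jsoup.parser.Parser;

public class EntityDecodeDemo {
    // Hypothetical helper: decodes character references in the given string.
    // Tags in the string are passed through verbatim; nothing is parsed into a DOM.
    static String decode(String html) {
        // false = the string is not an attribute value
        return Parser.unescapeEntities(html, false);
    }

    public static void main(String[] args) {
        // Named entity from the question
        System.out.println(decode("<td>Letter &oacute;</td>"));
        // Numeric character reference for the same letter
        System.out.println(decode("<td>Letter &#243;</td>"));
    }
}
```

Because no document is built, this works on fragments such as a bare td cell, which Jsoup.parse would otherwise restructure.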

answered Sep 27 '22 23:09

Sasha

I believe you can set the escape mode and output charset on the Jsoup Document after parsing it, something like this:

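// EscapeMode is org.jsoup.nodes.Entities.EscapeMode;
// StringUtils and CharEncoding come from Apache Commons Lang.
Document newDocument = Jsoup.parse(htmlString, StringUtils.EMPTY, Parser.htmlParser());
newDocument.outputSettings().escapeMode(EscapeMode.base);
newDocument.outputSettings().charset(CharEncoding.UTF_8);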
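A self-contained variant of the same idea without the Commons Lang dependency might look like the sketch below. The class name OutputSettingsDemo and the helper render are invented for illustration. It uses EscapeMode.xhtml rather than base, on the assumption that the smaller xhtml entity set is the safer choice: as far as I know, older jsoup releases wrote a named entity for every character covered by the active escape mode, so base could still produce &oacute; there. Treat that version detail as an assumption and check it against the release you use.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

public class OutputSettingsDemo {
    // Hypothetical helper: parses the HTML and serializes the body again,
    // asking jsoup to escape as little as possible on output.
    static String render(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings()
           .escapeMode(Entities.EscapeMode.xhtml) // minimal set of named entities
           .charset("UTF-8");                     // charset that can encode the letter directly
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(render("<p>Letter &oacute;</p>"));
    }
}
```

The output settings belong to the Document, so they also apply when toString() or outerHtml() is called on any Element inside it.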

answered Sep 27 '22 23:09

Алексей
