
How can I parse a HTML string in Java?

Tags:

java

html

parsing

Given the string "<table><tr><td>Hello World!</td></tr></table>", what is the (easiest) way to get a DOM Element representing it?

IttayD asked Sep 30 '09 13:09



1 Answer

If you have a string that contains HTML, you can use the jsoup library to parse it and get its elements:

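import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String htmlTable = "<table><tr><td>Hello World!</td></tr></table>";
// Jsoup.parse builds a full document (html/head/body) around the fragment
Document doc = Jsoup.parse(htmlTable);

// then use something like this to get your element:
Elements tds = doc.getElementsByTag("td");

// tds will contain this one element: <td>Hello World!</td>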
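
If you specifically need an org.w3c.dom.Element rather than jsoup's own Element type, and your markup happens to be well-formed XML (as the string in the question is), a JDK-only sketch could look like the one below. The class name HtmlFragmentDom and the helper parseFragment are made up for illustration. This is an assumption-laden shortcut, not a general HTML parser: it will reject typical real-world HTML (unclosed tags, &nbsp;, unquoted attributes), which is where jsoup is the better fit.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class HtmlFragmentDom {

    // Hypothetical helper: parses a well-formed (XML-compatible) fragment
    // with the JDK's XML parser and returns its root as an org.w3c.dom.Element.
    static Element parseFragment(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Markup that is not well-formed XML ends up here.
            throw new IllegalArgumentException("Not well-formed XML: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Element table = parseFragment("<table><tr><td>Hello World!</td></tr></table>");
        Element td = (Element) table.getElementsByTagName("td").item(0);
        System.out.println(td.getTextContent()); // prints "Hello World!"
    }
}
```

Here the returned root is the table element itself, since the XML parser does not wrap the fragment in html/body the way jsoup does.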

Good luck!

zygimantus answered Sep 18 '22 08:09