This might be an unclear question, so here's the code and an explanation:
Document doc = Jsoup.parse(exampleHtmlData);
Elements certainLinks = doc.select("a[href=google.com/example/]");
The String exampleHtmlData holds the HTML source of a certain site. This site has a lot of links that direct the user to Google. A few examples would be:
http://google.com/example/hello
http://google.com/example/certaindir/anotherdir/something
http://google.com/anotherexample
I want to extract all the links whose URL contains google.com/example/ using the doc.select function. How do I do this with jsoup?
Jsoup parses the source code as delivered from the server (or in this case loaded from file). It does not invoke client-side actions such as JavaScript or CSS DOM manipulation.
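To illustrate the point about client-side scripts, here is a minimal sketch. The class name NoScriptExecution and the sample HTML are made up for this example. The script would add a second link in a browser, but jsoup treats the script body as plain data, so only the static link is found.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class NoScriptExecution {

    // Counts the <a href> elements jsoup sees in the parsed source.
    static int countLinks(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("a[href]").size();
    }

    public static void main(String[] args) {
        // Hypothetical page: one static link, one link a browser would add via script.
        String html = "<a href=\"http://google.com/example/hello\">static</a>"
                + "<script>document.write('<a href=\"http://google.com/example/late\">late</a>');</script>";

        // The script is never run, so only the static link is counted.
        System.out.println(countLinks(html)); // prints 1
    }
}
```

If the links you need only appear after JavaScript runs, jsoup alone will not see them.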
What It Is. jsoup can parse HTML files, input streams, URLs, or even strings. It eases data extraction from HTML by offering Document Object Model (DOM) traversal methods and CSS and jQuery-like selectors. jsoup can manipulate the content: the HTML element itself, its attributes, or its text.
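As a rough sketch of the manipulation side, the snippet below changes the text and an attribute of the first link in a string of HTML. The class name EditLink, the method relabel, and the rel value are assumptions chosen for illustration only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class EditLink {

    // Finds the first <a href> in the HTML, replaces its text and adds a rel attribute.
    // Returns null when the HTML contains no such link.
    static Element relabel(String html, String newText) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("a[href]").first();
        if (link == null) {
            return null;
        }
        link.text(newText);            // replace the link text
        link.attr("rel", "nofollow");  // set (or overwrite) an attribute
        return link;
    }

    public static void main(String[] args) {
        Element link = relabel("<a href=\"http://google.com/example/hello\">old</a>", "new");
        System.out.println(link.text());       // prints new
        System.out.println(link.attr("href")); // prints http://google.com/example/hello
    }
}
```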
Where:
document − the document object, which represents the HTML DOM.
Jsoup − the main class, used to parse the given HTML String.
html − the HTML String.
sampleDiv − the Element object that represents the HTML node identified by the id "sampleDiv".
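You can refer to the jsoup selector syntax. [attr=value] only matches when the attribute equals the value exactly, while [attr*=value] matches when the attribute value contains the given substring: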
Document doc = Jsoup.parse(exampleHtmlData);
Elements certainLinks = doc.select("a[href*=google.com/example/]");
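For a complete, runnable version of the above, here is a small sketch that uses the three example URLs from the question. The class name GoogleExampleLinks and the helper extract are made up for this example, and the HTML string is a stand-in for exampleHtmlData.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class GoogleExampleLinks {

    // Returns the href of every <a> whose href contains "google.com/example/".
    static List<String> extract(String html) {
        Document doc = Jsoup.parse(html);
        // [href*=...] is a substring match on the attribute value.
        Elements certainLinks = doc.select("a[href*=google.com/example/]");
        List<String> hrefs = new ArrayList<>();
        for (Element link : certainLinks) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Stand-in for exampleHtmlData, built from the URLs in the question.
        String exampleHtmlData =
                "<a href=\"http://google.com/example/hello\">one</a>"
              + "<a href=\"http://google.com/example/certaindir/anotherdir/something\">two</a>"
              + "<a href=\"http://google.com/anotherexample\">three</a>";

        // Prints the first two URLs; the third does not contain "google.com/example/".
        for (String href : extract(exampleHtmlData)) {
            System.out.println(href);
        }
    }
}
```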