I know this question may seem stupid, but I need to use HtmlUnit, and it only seems to return a page either as XML or as text.
I don't know how to get the pure HTML (the same source code that a browser shows).
I need this because I have to pass the HTML to some modules that are already written. Any ideas?
Enable/Disable JavaScript support
By default, HtmlUnit throws an exception when a script on the page fails. You can change this so that script errors are silently ignored, as in real browsers (HtmlUnit will still log the exceptions), by setting the option throwExceptionOnScriptError to false:
final WebClient webClient = new WebClient();
webClient.getOptions().setThrowExceptionOnScriptError(false);
You can use the following piece of code to achieve your goal:
WebClient webClient = new WebClient();
Page page = webClient.getPage("http://example.com");
WebResponse response = page.getWebResponse();
String content = response.getContentAsString();
See the Javadoc of the WebResponse#getContentAsString() method for details.
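For reference, here is a minimal self-contained sketch of the same approach wrapped in a helper method. The class name RawHtmlExample and the method name fetchRawHtml are made up for illustration, and the imports assume the HtmlUnit 2.x package name (com.gargoylesoftware.htmlunit); newer 3.x releases use org.htmlunit instead. Disabling JavaScript is optional and is done here only because the raw response body does not depend on script execution.

```java
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;

import java.io.IOException;

public class RawHtmlExample {

    // Hypothetical helper: returns the response body as served,
    // not a re-serialized version of the parsed DOM.
    static String fetchRawHtml(String url) throws IOException {
        // WebClient is AutoCloseable, so try-with-resources releases it.
        try (WebClient webClient = new WebClient()) {
            // Optional: scripts are not needed to read the raw response.
            webClient.getOptions().setJavaScriptEnabled(false);

            Page page = webClient.getPage(url);
            return page.getWebResponse().getContentAsString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchRawHtml("http://example.com"));
    }
}
```

By contrast, HtmlPage.asXml() serializes the DOM that HtmlUnit built after parsing, so its output can differ from what the server actually sent (for example, unclosed tags get closed).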