I'm trying to learn more about HtmlUnit and am running some tests at the moment. I am trying to get basic information, such as the page title and text, from this site:
https://....com (removed the full url, important part is that it is https)
This is the code I use, which works fine on other websites:
final WebClient webClient = new WebClient();
final HtmlPage page;
page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
System.out.println(page.getTitleText());
System.out.println(page.asText());
Why can't I get this basic information? If it is because of security measures, what are the specifics, and can I bypass them? Thanks.
Edit: The code stops at webClient.getPage(); "test2" is never printed, so I cannot check whether page is null.
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_2);
final HtmlPage page;
System.out.println("test1");
try {
    page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
    System.out.println("test2");
} catch (Exception e) {
    // Print the failure so the reason getPage() stops is visible
    e.printStackTrace();
}
I solved this by adding this line of code:
webClient.setUseInsecureSSL(true);
This is the deprecated way of turning off SSL certificate checks. In current HtmlUnit versions, use:
webClient.getOptions().setUseInsecureSSL(true);
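For context, here is a minimal sketch of what "insecure SSL" usually means at the plain JSSE level: an SSLContext whose trust manager accepts any certificate chain. This is an illustration only, not HtmlUnit's actual implementation; the class name TrustAllSketch and the method insecureContext() are hypothetical names made up for this example. Skipping certificate validation removes protection against man-in-the-middle attacks, so it is best kept to testing.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper for illustration; not part of HtmlUnit.
final class TrustAllSketch {

    // A trust manager that accepts every certificate chain without checking it.
    static final X509TrustManager TRUST_ALL = new X509TrustManager() {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: no validation is performed.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: no validation is performed.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    };

    // Builds a TLS context that uses the trust-all manager above.
    static SSLContext insecureContext() {
        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            // null key managers and null SecureRandom select the platform defaults.
            ctx.init(null, new TrustManager[] { TRUST_ALL }, null);
            return ctx;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create TLS context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext ctx = insecureContext();
        System.out.println("Protocol: " + ctx.getProtocol());
    }
}
```

A socket factory obtained from such a context (ctx.getSocketFactory()) will complete the handshake even when the server's certificate is self-signed, expired, or issued by an unknown authority, which is why the page starts loading once the check is turned off.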
I think this is an authentication problem: if I go to that page in Firefox, I get a login box.
Try
webClient.setAuthentication(realm,username,password);
before the call to getPage().
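If the login box comes from HTTP Basic authentication (an assumption; the thread does not say which scheme the site uses), the credentials end up in an Authorization request header. The sketch below shows how that header value is formed using only the Java standard library; BasicAuthSketch and basicAuthHeader are hypothetical names, and "user" and "pass" are placeholder credentials.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of HtmlUnit.
final class BasicAuthSketch {

    // Builds the value of the Authorization header for HTTP Basic authentication:
    // the word "Basic", a space, then base64("username:password").
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials, not real ones.
        System.out.println("Authorization: " + basicAuthHeader("user", "pass"));
    }
}
```

Base64 is an encoding, not encryption, so this header is only safe to send over a connection that is properly secured.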