 

Why doesn't HtmlUnit work on this HTTPS webpage?

I'm trying to learn more about HtmlUnit and am running some tests. I want to get basic information, such as the page title and text, from this site:

https://....com (removed the full url, important part is that it is https)

This is the code I use; it works fine on other websites:

  final WebClient webClient = new WebClient();
  final HtmlPage page;
  page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
  System.out.println(page.getTitleText());
  System.out.println(page.asText());

Why can't I get this basic information? If it is because of security measures, what are the specifics, and can I bypass them? Thanks.

Edit: The code stops after webClient.getPage(): "test2" is never printed, so I cannot check whether page is null.

  final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_2);
  final HtmlPage page;
  System.out.println("test1");
  try {
      page = (HtmlPage) webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
      System.out.println("test2");
  } catch (Exception e) {
      // Show why getPage() failed.
      e.printStackTrace();
  }
asked Dec 12 '22 15:12

Tunca Ersoy


2 Answers

I solved this by adding this line of code:

webClient.setUseInsecureSSL(true);

This is the deprecated way of disabling SSL certificate validation. In current HtmlUnit versions, use:

webClient.getOptions().setUseInsecureSSL(true);
answered Dec 28 '22 10:12

Tunca Ersoy
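
For context, here is a rough sketch of what "insecure SSL" amounts to in plain Java (JSSE): an SSLContext whose trust manager accepts any certificate chain. This is not HtmlUnit's own code and its internals may differ; the names InsecureSslSketch, TrustAllManager and newInsecureContext are made up for illustration. Use something like this only for testing, because it removes protection against man-in-the-middle attacks.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical illustration class; not part of HtmlUnit.
class InsecureSslSketch {

    /** Trust manager that accepts every certificate chain without checking it. */
    static final class TrustAllManager implements X509TrustManager {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: no validation is performed.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: expired, self-signed or mismatched
            // certificates are all accepted.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    }

    /** Builds a TLS context that skips certificate validation. */
    static SSLContext newInsecureContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, new TrustManager[] { new TrustAllManager() }, new SecureRandom());
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not build SSL context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = newInsecureContext();
        System.out.println("Protocol: " + context.getProtocol());
    }
}
```

If the site simply has a certificate that the JVM does not trust, importing that certificate into a trust store is the safer long-term fix than turning validation off.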


I think this is an authentication problem: if I go to that page in Firefox, I get a login box.

Try

webClient.setAuthentication(realm,username,password);

before the call to getPage().

answered Dec 28 '22 10:12

DaveH
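
A small sketch of what such credentials look like on the wire, assuming the login box is an HTTP Basic challenge (a 401 response) and not an HTML form; if it is a form, this does not apply. It uses only the standard library, not HtmlUnit, and the class name, method name and credentials are placeholders for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration class; not part of HtmlUnit.
class BasicAuthSketch {

    /** Builds the value of an HTTP "Authorization" header for Basic authentication. */
    static String basicAuthHeader(String username, String password) {
        // Basic auth is "username:password", Base64-encoded (RFC 7617).
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials, not real ones.
        System.out.println("Authorization: " + basicAuthHeader("exampleUser", "examplePassword"));
    }
}
```

Base64 is an encoding, not encryption, so these credentials are only protected when the connection itself is HTTPS.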