
HTMLUnit not working with AngularJS

Following https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot, I am trying to use HtmlUnit (2.13) to create an HTML snapshot of a web page that uses AngularJS (1.2.1).

My Java code is:

WebClient webClient = new WebClient();

webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.setCssErrorHandler(new SilentCssErrorHandler());

webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setRedirectEnabled(false);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setPopupBlockerEnabled(true);
webClient.getOptions().setTimeout(10000);

webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnScriptError(true);
webClient.getOptions().setPrintContentOnFailingStatusCode(true);

HtmlPage page = webClient.getPage(new WebRequest(new URL("..."), HttpMethod.GET));
webClient.waitForBackgroundJavaScript(5000);
String result = page.asXml();

Although webClient.getPage(...) does not throw an exception, the result string still contains unevaluated Angular expressions such as

<div>
    {{name}}
</div>
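To make the symptom easy to check in a test, here is a minimal sketch in plain Java that scans a snapshot string for leftover {{...}} markers. SnapshotCheck and unevaluatedExpressions are hypothetical names, not part of HtmlUnit, and the sketch assumes the default AngularJS interpolation symbols are in use:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of HtmlUnit): finds leftover {{...}} bindings in a snapshot. */
final class SnapshotCheck {

    // Assumes the default AngularJS interpolation markers "{{" and "}}".
    // Non-greedy, and DOTALL so an expression may span several lines.
    private static final Pattern BINDING =
            Pattern.compile("\\{\\{(.*?)\\}\\}", Pattern.DOTALL);

    /** Returns the trimmed text of every {{...}} expression still present in the HTML. */
    static List<String> unevaluatedExpressions(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = BINDING.matcher(html);
        while (m.find()) {
            found.add(m.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        // Same shape as the snapshot fragment from the question.
        String snapshot = "<div>\n    {{name}}\n</div>";
        System.out.println(unevaluatedExpressions(snapshot));
    }
}
```

Passing the string returned by page.asXml() to such a helper gives a quick signal of whether Angular was bootstrapped before the snapshot was taken. An empty list only means no markers were found; it does not prove the page rendered correctly.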

I am aware of http://htmlunit.10904.n7.nabble.com/htmlunit-to-scrape-angularjs-td29931.html#a30075, but the recommendation given there does not work either.

Of course, the same GET request works without errors in all current browsers.

Any ideas or experience on how to get HtmlUnit working with AngularJS?

Update:

I created an HtmlUnit bug report.
For the moment, I switched my implementation to PhantomJS. Maybe this code snippet helps others with a similar problem:

System.setProperty("phantomjs.binary.path", "phantomjs.exe");
DesiredCapabilities caps = new DesiredCapabilities();
caps.setJavascriptEnabled(true);
caps.setCapability("takesScreenshot", false);

PhantomJSDriver driver = new PhantomJSDriver(caps);
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
// WebDriver.get(...) expects the URL as a String, not a java.net.URL
driver.get("...");
String result = driver.getPageSource();

Update 2: I stopped rendering my pages manually, as the Google crawler now renders Angular sites itself.

cnmuc asked Nov 22 '13 19:11



2 Answers

I had the same problem but could not use explicit bootstrapping, because AngularJS e2e tests don't work with an explicit bootstrap.

I solved the problem by using

<html id="ng-app" class="ng-app: appmodule;"> 

instead of

<html ng-app="appmodule">

With this change, the HtmlUnit tests work and the e2e tests work as well.

Very likely, HtmlUnit doesn't (fully?) support document.querySelectorAll(). angularInit() uses this method to find ng-app directives.

The syntactic variant for the ng-app directive works around the document.querySelectorAll() calls in angularInit().
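To illustrate what the class-based form encodes, here is a rough Java sketch that pulls the module name out of a class attribute such as "ng-app: appmodule;". The regular expression is modeled on what AngularJS 1.2 appears to do when it inspects an element's class names; treat the exact pattern as an assumption rather than a copy of the library source. NgAppClassParser and moduleFromClass are hypothetical names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration: extracts the module name from a class-based ng-app marker. */
final class NgAppClassParser {

    // Assumed pattern: "ng-app" or "ng:app", optionally followed by ": moduleName;".
    // The surrounding \s anchors rely on the padding added in moduleFromClass.
    private static final Pattern NG_APP_CLASS =
            Pattern.compile("\\sng[:\\-]app(:\\s*([\\w\\d_]+);?)?\\s");

    /**
     * Returns the module name, an empty string if the marker names no module,
     * or null if the class attribute carries no ng-app marker at all.
     */
    static String moduleFromClass(String classAttribute) {
        // Pad with spaces so a marker at the very start or end still matches.
        Matcher m = NG_APP_CLASS.matcher(" " + classAttribute + " ");
        if (!m.find()) {
            return null;
        }
        String module = m.group(2);
        return module == null ? "" : module;
    }

    public static void main(String[] args) {
        System.out.println(moduleFromClass("ng-app: appmodule;"));
    }
}
```

The point of the sketch is that the module name can be recovered from the class string alone, which would explain why this variant still bootstraps if the selector-based lookup does not find the element.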

stephanme answered Nov 17 '22 05:11



I had the same problem with unevaluated Angular expressions when using HtmlUnit. The solution is to bootstrap the application manually. Steps to reproduce:

Minimal example of an app that works in a browser, but not with HtmlUnit:

<!doctype html>
<html ng-app>
<head>
    <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
</head>
<body>
    <div>
        <label>Name:</label> <input type="text" ng-model="yourName"
            placeholder="Enter a name here">
        <hr>
        <h1>Hello {{yourName}}!</h1>
    </div>
</body>
</html>

Modification steps:

  1. Bootstrap the app manually
  2. Remove ng-app so the app is not bootstrapped twice
  3. If you use $http or similar, re-synchronize the AJAX calls with:

    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

And now a working example:

<!doctype html>
<html>
<head>
    <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
    <script>
        angular.element(document).ready(function() {
            angular.module('myApp', []);
            angular.bootstrap(document, ['myApp']);
        });
    </script>
</head>
<body>
    <div>
        <label>Name:</label> <input type="text" ng-model="yourName"
            placeholder="Enter a name here">
        <hr>
        <h1>Hello {{yourName}}!</h1>
    </div>
</body>
</html>

Test:

WebClient webClient = new WebClient();
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage page = webClient.getPage("http://localhost:8080/index.html");

// Initial state
assertEquals("Hello !", page.getElementsByTagName("h1").get(0).asText());

// Set value
((HtmlInput)page.getElementsByTagName("input").get(0)).setValueAttribute("world");

// New state
assertEquals("Hello world!", page.getElementsByTagName("h1").get(0).asText());

It's a working solution, but not a very pleasant one. I don't know whether the problem lies with HtmlUnit or with AngularJS.

Anton Bessonov answered Nov 17 '22 07:11
