Following https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot, I am trying to use HtmlUnit (2.13) to create an HTML snapshot of a webpage that uses AngularJS (1.2.1).
My Java code is:
WebClient webClient = new WebClient();
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.setCssErrorHandler(new SilentCssErrorHandler());
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setRedirectEnabled(false);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setPopupBlockerEnabled(true);
webClient.getOptions().setTimeout(10000);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnScriptError(true);
webClient.getOptions().setPrintContentOnFailingStatusCode(true);
HtmlPage page = webClient.getPage(new WebRequest(new URL("..."), HttpMethod.GET));
webClient.waitForBackgroundJavaScript(5000);
String result = page.asXml();
Although webClient.getPage(...) does not throw any exception, the result string still contains "unevaluated angular expressions" such as
<div>
{{name}}
</div>
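One way to check a snapshot for this symptom programmatically is to scan the result string for leftover interpolation markers. The following is only a minimal sketch: the class name SnapshotCheck and the method findUnevaluatedExpressions are hypothetical helpers, not part of HtmlUnit, and it assumes the app uses AngularJS's default {{ }} delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HtmlUnit or AngularJS.
class SnapshotCheck {
    // Matches an interpolation such as {{name}}.
    // Assumption: the app keeps AngularJS's default {{ }} delimiters.
    private static final Pattern EXPRESSION =
            Pattern.compile("\\{\\{(.*?)\\}\\}", Pattern.DOTALL);

    // Returns the text of every expression still present in the snapshot.
    static List<String> findUnevaluatedExpressions(String snapshot) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EXPRESSION.matcher(snapshot);
        while (matcher.find()) {
            found.add(matcher.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String snapshot = "<div>\n{{name}}\n</div>";
        System.out.println(findUnevaluatedExpressions(snapshot)); // prints [name]
    }
}
```

An empty list would suggest that Angular has rendered the page; a non-empty list names the expressions that were left untouched.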
I am aware of http://htmlunit.10904.n7.nabble.com/htmlunit-to-scrape-angularjs-td29931.html#a30075 but the recommendation given there does not work either.
Of course the same GET-request works without exceptions in all current browsers.
Any ideas or experiences on how to get HtmlUnit working with AngularJS?
Update:
I created an HtmlUnit bug report.
For the moment, I switched my implementation to PhantomJS. Maybe this code snippet helps others with a similar problem:
System.setProperty("phantomjs.binary.path", "phantomjs.exe");
DesiredCapabilities caps = new DesiredCapabilities();
caps.setJavascriptEnabled(true);
caps.setCapability("takesScreenshot", false);
PhantomJSDriver driver = new PhantomJSDriver(caps);
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
driver.get("..."); // WebDriver.get expects the URL as a String, not a java.net.URL
String result = driver.getPageSource();
Update2: I stopped rendering my pages manually, as the Google crawler now renders Angular sites itself.
I had the same problem but could not use explicit bootstrapping because angular e2e tests don't work with explicit bootstrap.
I solved the problem by using
<html id="ng-app" class="ng-app: appmodule;">
instead of
<html ng-app="appmodule">
With this change, the HtmlUnit tests work, and the e2e tests work as well.
Very likely, HtmlUnit doesn't (fully?) support document.querySelectorAll(). This method is used by angularInit() to find ng-app directives.
The syntactic variant for the ng-app directive works around the document.querySelectorAll() calls in angularInit().
I had the same problem with "unevaluated angular expressions" when using HtmlUnit. The solution is to bootstrap the application manually. Reproduction steps:
Minimal example of an app that works in a browser, but not with HtmlUnit:
<!doctype html>
<html ng-app>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
</head>
<body>
<div>
<label>Name:</label> <input type="text" ng-model="yourName"
placeholder="Enter a name here">
<hr>
<h1>Hello {{yourName}}!</h1>
</div>
</body>
</html>
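Modification steps:
Remove the ng-app attribute from the <html> element and bootstrap the application manually with angular.bootstrap, as in the working example further down.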
If you use $http or something similar, you should re-synchronize the AJAX calls with:
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
And now the working example:
<!doctype html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js"></script>
<script>
angular.element(document).ready(function() {
angular.module('myApp', []);
angular.bootstrap(document, ['myApp']);
});
</script>
</head>
<body>
<div>
<label>Name:</label> <input type="text" ng-model="yourName"
placeholder="Enter a name here">
<hr>
<h1>Hello {{yourName}}!</h1>
</div>
</body>
</html>
Test:
WebClient webClient = new WebClient();
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage page = webClient.getPage("http://localhost:8080/index.html");
// Initial state
assertEquals("Hello !", page.getElementsByTagName("h1").get(0).asText());
// Set value
((HtmlInput)page.getElementsByTagName("input").get(0)).setValueAttribute("world");
// New state
assertEquals("Hello world!", page.getElementsByTagName("h1").get(0).asText());
It's a working solution, but not a very pleasant one. I don't know whether this is a problem in HtmlUnit or in AngularJS.
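To check from Java whether the manual bootstrap actually took effect, one option is to poll the snapshot a few times instead of inspecting it only once. This is a rough sketch under stated assumptions: SnapshotPoller and waitForRenderedSnapshot are hypothetical names, the default {{ }} delimiters are assumed, and the lambda syntax assumes Java 8 or later. Polling cannot help if Angular never bootstraps at all; it only covers the case where rendering finishes a little late.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of HtmlUnit.
class SnapshotPoller {
    // Re-reads the snapshot until it no longer contains "{{" or until
    // maxAttempts reads have been made, then returns the last snapshot read.
    static String waitForRenderedSnapshot(Supplier<String> snapshotSource,
                                          int maxAttempts, long pauseMillis) {
        String snapshot = snapshotSource.get();
        for (int attempt = 1; attempt < maxAttempts && snapshot.contains("{{"); attempt++) {
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return snapshot;
            }
            snapshot = snapshotSource.get();
        }
        return snapshot;
    }
}
```

With the page object from the test above, a call could look like waitForRenderedSnapshot(() -> page.asXml(), 10, 500). If the returned string still contains "{{", the page was not rendered within the allowed attempts.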