Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HtmlUnit css not applied properly

Tags:

java

css

htmlunit

I try to save the google page using HtmlUnit. But I can't get proper UI. When I check the saved page codes style tags are empty.

My code is here.

public static void main(String[] args) throws IOException {

    FileUtils.cleanDirectory(new File("/home/user1/Documents/Aaa")); 
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setCssEnabled(true);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.waitForBackgroundJavaScriptStartingBefore(1000);
    webClient.waitForBackgroundJavaScript(1000);
    webClient.getOptions().setTimeout(5000);
    System.out.println("******************loaded**********************************");
    try {
        HtmlPage page = webClient.getPage("https://www.google.com");
        page.save(new File("/home/user1/Documents/Aaa/index.html"));
    } catch (Exception e) {
        System.out.println("******************catch***********************************");
        e.printStackTrace();
    }
    webClient.close();
    System.out.println("******************finished********************************");
}

And my page looks like

enter image description here

Console logs

Dec 10, 2016 3:47:45 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Dec 10, 2016 3:47:46 PM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
SEVERE: runtimeError: message=[TypeError: object is not iterable] sourceName=[https://www.google.co.in/xjs/_/js/k=xjs.s.en.igGBAtxEWN0.O/m=sx,c,sb,cdos,cr,elog,hsm,jsa,r,qsm,j,p,d,csi/am=AAiUPF6wAOL_ISBuIRxBasDAoA/rt=j/d=1/t=zcms/rs=ACT90oGjQTdwqicso-l4vNE-7GeAqTtjtw] line=[10] lineSource=[null] lineOffset=[0]
Dec 10, 2016 3:47:46 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://www.google.co.in/?gfe_rd=cr&ei=RtZLWL3cDsmL8QeW4Yb4AQ' [1:14018] Error in expression. (Invalid token " ". Was expecting one of: <NUMBER>, "inherit", <IDENT>, <STRING>, <HASH>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <RESOLUTION_DPI>, <RESOLUTION_DPCM>, <PERCENTAGE>, <DIMENSION>, <URI>, <FUNCTION>, "progid:".)
Dec 10, 2016 3:47:46 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://www.google.co.in/?gfe_rd=cr&ei=RtZLWL3cDsmL8QeW4Yb4AQ' [1:14042] Error in expression. (Invalid token " ". Was expecting one of: <NUMBER>, "inherit", <IDENT>, <STRING>, <HASH>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <RESOLUTION_DPI>, <RESOLUTION_DPCM>, <PERCENTAGE>, <DIMENSION>, <URI>, <FUNCTION>, "progid:".)
Dec 10, 2016 3:47:46 PM com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine handleJavaScriptException
INFO: Caught script exception
======= EXCEPTION START ========
EcmaError: lineNumber=[551] column=[0] lineSource=[null] name=[TypeError] sourceName=[https://www.google.co.in/xjs/_/js/k=xjs.s.en.igGBAtxEWN0.O/m=sx,c,sb,cdos,cr,elog,hsm,jsa,r,qsm,j,p,d,csi/am=AAiUPF6wAOL_ISBuIRxBasDAoA/rt=j/d=1/t=zcms/rs=ACT90oGjQTdwqicso-l4vNE-7GeAqTtjtw] message=[TypeError: Cannot read property "Kf" from undefined (https://www.google.co.in/xjs/_/js/k=xjs.s.en.igGBAtxEWN0.O/m=sx,c,sb,cdos,cr,elog,hsm,jsa,r,qsm,j,p,d,csi/am=AAiUPF6wAOL_ISBuIRxBasDAoA/rt=j/d=1/t=zcms/rs=ACT90oGjQTdwqicso-l4vNE-7GeAqTtjtw#551)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "Kf" from undefined (https://www.google.co.in/xjs/_/js/k=xjs.s.en.igGBAtxEWN0.O/m=sx,c,sb,cdos,cr,elog,hsm,jsa,r,qsm,j,p,d,csi/am=AAiUPF6wAOL_ISBuIRxBasDAoA/rt=j/d=1/t=zcms/rs=ACT90oGjQTdwqicso-l4vNE-7GeAqTtjtw#551)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:921)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:852)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:824)
    at com.gargoylesoftware.htmlunit.InteractivePage.executeJavaScriptFunctionIfPossible(InteractivePage.java:216)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventListenersContainer.executeEventListeners(EventListenersContainer.java:258)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:322)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventTarget.fireEvent(EventTarget.java:206)
    at com.gargoylesoftware.htmlunit.javascript.host.Window.dispatchEvent(Window.java:2033)
    at com.gargoylesoftware.htmlunit.javascript.host.Window$2$1.run(Window.java:2119)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.host.Window$2.execute(Window.java:2124)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:966)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.access$500(JavaScriptEngine.java:101)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:916)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:852)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:824)
    at com.gargoylesoftware.htmlunit.InteractivePage.executeJavaScriptFunctionIfPossible(InteractivePage.java:216)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventListenersContainer.executeEventListeners(EventListenersContainer.java:258)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:322)
    at com.gargoylesoftware.htmlunit.javascript.host.event.EventTarget.fireEvent(EventTarget.java:191)
    at com.gargoylesoftware.htmlunit.html.DomElement$2.run(DomElement.java:1190)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.html.DomElement.fireEvent(DomElement.java:1195)
    at com.gargoylesoftware.htmlunit.html.DomElement.fireEvent(DomElement.java:1163)
    at com.gargoylesoftware.htmlunit.InteractivePage.setFocusedElement(InteractivePage.java:100)
    at com.gargoylesoftware.htmlunit.InteractivePage.setFocusedElement(InteractivePage.java:66)
    at com.gargoylesoftware.htmlunit.html.HtmlElement.detach(HtmlElement.java:1254)
    at com.gargoylesoftware.htmlunit.html.DomNode.remove(DomNode.java:1132)
    at com.gargoylesoftware.htmlunit.html.DomNode.replace(DomNode.java:1210)
    at com.gargoylesoftware.htmlunit.javascript.host.dom.Node.replaceChild(Node.java:444)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
    at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:448)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1540)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:800)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    at net.sourceforge.htmlunit.corejs.javascript.NativeArray.iterativeMethod(NativeArray.java:1694)
    at net.sourceforge.htmlunit.corejs.javascript.NativeArray.execIdCall(NativeArray.java:405)
    at net.sourceforge.htmlunit.corejs.javascript.IdFunctionObject.call(IdFunctionObject.java:94)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.applyOrCall(ScriptRuntime.java:2575)
    at net.sourceforge.htmlunit.corejs.javascript.BaseFunction.execIdCall(BaseFunction.java:321)
    at net.sourceforge.htmlunit.corejs.javascript.IdFunctionObject.call(IdFunctionObject.java:94)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1540)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:800)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:413)
    at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:252)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3264)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(JavaScriptEngine.java:794)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:906)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:803)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:779)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:975)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:352)
    at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:238)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:966)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.access$500(JavaScriptEngine.java:101)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:916)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:852)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:824)
    at com.gargoylesoftware.htmlunit.InteractivePage.executeJavaScriptFunctionIfPossible(InteractivePage.java:216)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptFunctionJob.runJavaScript(JavaScriptFunctionJob.java:52)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptExecutionJob.run(JavaScriptExecutionJob.java:102)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptJobManagerImpl.runSingleJob(JavaScriptJobManagerImpl.java:426)
    at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:157)
    at java.lang.Thread.run(Thread.java:745)
Caused by: net.sourceforge.htmlunit.corejs.javascript.EcmaError: TypeError: Cannot read property "Kf" from undefined (https://www.google.co.in/xjs/_/js/k=xjs.s.en.igGBAtxEWN0.O/m=sx,c,sb,cdos,cr,elog,hsm,jsa,r,qsm,j,p,d,csi/am=AAiUPF6wAOL_ISBuIRxBasDAoA/rt=j/d=1/t=zcms/rs=ACT90oGjQTdwqicso-l4vNE-7GeAqTtjtw#551)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3915)

Any idea for this case?

Edited: I think that error is parsing when the CSS3 filter property. In the google page code like this.

filter:progid:DXImageTransform.Microsoft.gradient(startColorstr=#4387fd,endColorstr=#4683ea,GradientType=1)

In this filter has a colon and progrid has a colon. So this leads to CSS Parsing error. Any solution for this?

like image 739
Anbu Avatar asked Dec 05 '16 07:12

Anbu


1 Answers

At first:

WARNING: CSS error: 'https://www.google.co.in/?gfe_rd=cr&ei=hyxFWPuPJMyL8QfigajQCw'

This is a warning, means more or less that the css parser does not like one of the given css rules. This has not real impact because this might lead to some missing format information if the js code on the page ask a html element for the style.

Second

webClient.waitForBackgroundJavaScriptStartingBefore(1000);
webClient.waitForBackgroundJavaScript(1000);

These two calls are no configuration options - doing these calls at the given place is meaningless.

But both are only minor things, the only real problem you have is the

com.gargoylesoftware.htmlunit.ScriptException: Exception invoking constructor

exception. This usually stops the js processing for the page (but not in your case because you have set ThrowExceptionOnScriptError to false).

To analyze the problem we need at least the stack trace of the exception that was thrown by the constructor. Usually this stack trace is also part of the log.

But maybe there is a simpler solution. You have not provided any information about the HtmlUnit version you are using. I like to suggest to use the latest available SNAPSHOT build (you can find some info how to get at http://htmlunit.sourceforge.net/gettingLatestCode.html). I did a quick test with the latest htmlunit code and was able to get the page without any exception.

    public static void main(String[] args) throws IOException {

    FileUtils.cleanDirectory(new File("c://temp/htmlunit/aa")); 
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setCssEnabled(true);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setTimeout(5000);
    System.out.println("******************loaded**********************************");
    try {
        HtmlPage page = webClient.getPage("https://www.google.com");
        webClient.waitForBackgroundJavaScriptStartingBefore(1000);
        webClient.waitForBackgroundJavaScript(1000);
        page.save(new File("c://temp/htmlunit/aa/index.html"));
    } catch (Exception e) {
        System.out.println("******************catch***********************************");
        e.printStackTrace();
    }
    webClient.close();
    System.out.println("******************finished********************************");
}
like image 143
RBRi Avatar answered Oct 05 '22 18:10

RBRi