 

Why does "getElementById() takes exactly 1 argument (2 given)" occur only sometimes? Can someone explain it?

#-*- coding:utf-8 -*-
# First section: fails with a TypeError on the getElementById call.
import win32com.client, pythoncom
import time

ie = win32com.client.DispatchEx('InternetExplorer.Application.1')
ie.Visible = 1
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep(5)  # crude wait for the page to load

ie.Document.getElementById("browse_keyword").value = "Computer"
ie.Document.getElementsByTagName("input")[24].click()

# Second section: runs without error.
import win32com.client, pythoncom
import time

ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('www.baidu.com')
time.sleep(5)  # crude wait for the page to load

print('browse_keyword')
ie.Document.getElementById("kw").value = "Computer"
ie.Document.getElementById("su").click()
print('Done!')

When I run the first code section, it raises:

ie.Document.getElementById("browse_keyword").value ="Computer"
TypeError: getElementById() takes exactly 1 argument (2 given)

The second code section runs fine. What difference between the two makes the results differ?

Tre Mi asked Mar 22 '12 05:03


3 Answers

The difference between the two cases has nothing to do with the COM name you specify: both InternetExplorer.Application and InternetExplorer.Application.1 resolve to the exact same CLSID, which gives you an IWebBrowser2 interface. The difference in runtime behaviour is purely down to the URL you retrieved.

The difference here may be that the page which works is HTML whereas the other one is XHTML, or it may simply be that errors in the failing page prevent the DOM from initialising properly. Either way, it appears to be a 'feature' of the IE9 parser.

Note that this doesn't happen if you enable compatibility mode (after the second line below I clicked the compatibility mode icon in the address bar):

(Pdb) ie.Document.DocumentMode
9.0
(Pdb) ie.Document.getElementById("browse_keyword").value
*** TypeError: getElementById() takes exactly 1 argument (2 given)
(Pdb) ie.Document.documentMode
7.0
(Pdb) ie.Document.getElementById("browse_keyword").value
u''

Unfortunately I don't know how to toggle compatibility mode from a script (the documentMode property is not settable). Maybe someone else does?

The wrong argument count is, I think, coming from COM: Python passes in the arguments and the COM object rejects the call with a misleading error.

Duncan answered Nov 11 '22 21:11


As a method of a COMObject, getElementById is built dynamically by win32com.
On my computer, if the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, the generated method is almost equivalent to

def getElementById(self):
    return self._ApplyTypes_(3000795, 1, (12, 0), (), 'getElementById', None,)

If the URL is www.baidu.com, the generated method is almost equivalent to

def getElementById(self, v=pythoncom.Missing):
    ret = self._oleobj_.InvokeTypes(1088, LCID, 1, (9, 0), ((8, 1),), v)
    if ret is not None:
        ret = Dispatch(ret, 'getElementById', '{3050F1FF-98B5-11CF-BB82-00AA00BDCE0B}')
    return ret

Obviously, if you pass an argument to the first version, you'll receive a TypeError, because that method accepts nothing besides self. But if you call it the way it was built, namely ie.Document.getElementById() with no argument, you won't receive a TypeError, but a com_error.
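
The odd argument count in the message comes from Python counting the implicit self. Here is a minimal sketch that reproduces the same kind of error without IE or win32com. FakeDocument and call_with_id are hypothetical names made up for this illustration, and the exact wording of the message depends on the Python version (Python 2 prints the "takes exactly 1 argument (2 given)" form seen in the question).

```python
class FakeDocument(object):
    """Hypothetical stand-in for a COM wrapper whose method was built with no parameters."""

    def getElementById(self):
        # Mirrors the zero-parameter wrapper shown above: only `self` is accepted.
        return None


def call_with_id(document, element_id):
    """Call getElementById with one argument; return the TypeError text, or None on success."""
    try:
        document.getElementById(element_id)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # `self` plus the element id makes two arguments, one more than the method accepts.
    print(call_with_id(FakeDocument(), "browse_keyword"))
```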

Why did win32com build the wrong code?
Let us look at ie and ie.Document. They are both COMObjects, more precisely, win32com.client.CDispatch instances. CDispatch is just a wrapper class. The core is its _oleobj_ attribute, whose type is PyIDispatch.

>>> ie, ie.Document
(<COMObject InternetExplorer.Application>, <COMObject <unknown>>)
>>> ie.__class__, ie.Document.__class__
(<class win32com.client.CDispatch at 0x02CD00A0>,
 <class win32com.client.CDispatch at 0x02CD00A0>)
>>> oleobj = ie.Document._oleobj_
>>> oleobj
<PyIDispatch at 0x02B37800 with obj at 0x003287D4>

To build getElementById, win32com needs to get the type information for the getElementById method from _oleobj_. Roughly, win32com uses the following procedure:

typeinfo = oleobj.GetTypeInfo()
typecomp = typeinfo.GetTypeComp()
x, funcdesc = typecomp.Bind('getElementById', pythoncom.INVOKE_FUNC)
......

funcdesc contains almost all the important information, e.g. the number and types of the parameters.
If the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, funcdesc.args is (), while the correct funcdesc.args should be ((8, 1, None),).
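
To check this on your own machine, the lookup steps can be wrapped in a small helper. This is only a sketch: bound_method_args is a hypothetical name, INVOKE_FUNC is hard-coded to 1 (the value of pythoncom.INVOKE_FUNC) so the block itself imports nothing Windows-specific, and the check on the first value returned by Bind is an assumption based on how win32com's dynamic dispatch code treats it.

```python
INVOKE_FUNC = 1  # same value as pythoncom.INVOKE_FUNC


def bound_method_args(oleobj, name, invoke_kind=INVOKE_FUNC):
    """Return funcdesc.args for method `name`, or None if no function was bound.

    `oleobj` is expected to be a PyIDispatch, e.g. ie.Document._oleobj_.
    """
    typeinfo = oleobj.GetTypeInfo()
    typecomp = typeinfo.GetTypeComp()
    kind, funcdesc = typecomp.Bind(name, invoke_kind)
    # Assumption: a kind of 1 means Bind returned a FUNCDESC.
    if kind != 1 or funcdesc is None:
        return None
    return funcdesc.args
```

On Windows with pywin32 installed, calling bound_method_args(ie.Document._oleobj_, 'getElementById') after each Navigate should let you compare the two pages directly.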

Long story short, win32com retrieved the wrong type information, and thus it built the wrong method.
I am not sure who is to blame, PyWin32 or IE. But based on my observation, I found nothing wrong in PyWin32's code. On the other hand, the following script runs perfectly in Windows Script Host.

var ie = new ActiveXObject("InternetExplorer.Application");
ie.Visible = 1;
ie.Navigate("http://ieeexplore.ieee.org/xpl/periodicals.jsp");
WScript.sleep(5000);
ie.Document.getElementById("browse_keyword").value = "Computer";

Duncan has already pointed out that IE's compatibility mode can prevent the problem. Unfortunately, it seems impossible to enable compatibility mode from a script.
But I found a trick that helps us bypass the problem.

First, you need to visit a good site, one which gives us an HTML page, and retrieve a correct Document object from it.

ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('http://www.haskell.org/arrows')
time.sleep(5)
document = ie.Document

Then navigate to the page which doesn't work:

ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep(5)

Now you can access the DOM of the second page via the old Document object.

document.getElementById('browse_keyword').value = "Computer"

If you use the new Document object, you will get a TypeError again.

>>> ie.Document.getElementById('browse_keyword')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: getElementById() takes exactly 1 argument (2 given)
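
The two navigation steps of the trick can be folded into one helper. This is a sketch only: navigate_with_borrowed_document is a hypothetical name, the fixed five-second wait is copied from the snippets above rather than being a reliable way to detect that a page has loaded, and the sleep parameter exists just so the waiting can be swapped out.

```python
import time


def navigate_with_borrowed_document(ie, good_url, target_url,
                                    wait_seconds=5, sleep=time.sleep):
    """Load good_url, keep its Document wrapper, then navigate to target_url.

    Returns the Document object obtained while the first page was loaded.
    """
    ie.Navigate(good_url)
    sleep(wait_seconds)
    document = ie.Document  # wrapper built while the "good" page is loaded
    ie.Navigate(target_url)
    sleep(wait_seconds)
    return document
```

With the ie object created as above, document = navigate_with_borrowed_document(ie, 'http://www.haskell.org/arrows', 'http://ieeexplore.ieee.org/xpl/periodicals.jsp') would give you the old Document object to call getElementById on.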

nymk answered Nov 11 '22 19:11


I just got this issue when I upgraded to IE11 from IE8.

I've only tested this on the getElementsByTagName function. You have to call the function from the Body element.

#-*- coding:utf-8 -*-
import win32com.client, pythoncom
import time

ie = win32com.client.DispatchEx('InternetExplorer.Application.1')
ie.Visible = 1
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep( 5 )

ie.Document.Body.getElementById("browse_keyword").value ="Computer"
ie.Document.Body.getElementsByTagName("input")[24].click()
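
If the same script has to work on pages where the Document call succeeds and on pages where it fails, a fallback along these lines might do. It is an untested sketch: get_elements_by_tag is a hypothetical name, and it only covers getElementsByTagName, the one function I tested.

```python
def get_elements_by_tag(document, tag_name):
    """Try document.getElementsByTagName first, then fall back to document.Body."""
    try:
        return document.getElementsByTagName(tag_name)
    except TypeError:
        # The wrapper was built without parameters; ask the Body element instead.
        # Note that this also swallows any unrelated TypeError raised by the call.
        return document.Body.getElementsByTagName(tag_name)
```

Usage would be get_elements_by_tag(ie.Document, "input")[24].click() in place of the last line above.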

user3268142 answered Nov 11 '22 19:11