I am trying to search through all the html of websites that I reach using selenium webdriver. In selenium, when I have an iframe, I must switch to the iframe and then switch back to the main html to search for other iframes.
However, with nested iframes, this gets complicated quickly. I must switch to an iframe, search it for iframes, switch to one of those, search that for iframes, and so on. To reach a different iframe I must switch back to the main frame, and I need the path I took saved so that I can return to where I was before.
Unfortunately, many pages I've found have iframes within iframes within iframes (and so on).
Is there a simple algorithm for this? Or a better way of doing it?
findElements(By.tagName("iframe")).size(); This code counts the iframes in the currently selected document using the tag name 'iframe'. Iframes nested inside those iframes are not included in the count.
The difference between driver.switchTo().defaultContent() and driver.switchTo().parentFrame() is that the first method switches control to the main web page regardless of how many frames the page contains, while the second switches control to the parent frame of the current frame.
frame(int arg0); Selects a frame by its (zero-based) index. That is, if a page has more than one frame, the first frame is at index "0", the second at index "1", and so on. Once a frame is selected, all subsequent calls on the WebDriver interface are made to that frame.
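The calls quoted above come from the Java bindings. A rough Python equivalent might look like the sketch below, assuming Selenium 4 and an already created `driver`. The helper names `count_iframes` and `enter_frame_and_return` are hypothetical and exist only for illustration.

```python
# Locator tuple; "tag name" is the string behind By.TAG_NAME in
# selenium.webdriver.common.by, spelled out so the sketch needs no import.
IFRAME = ("tag name", "iframe")


def count_iframes(driver):
    """Count <iframe> elements in the currently selected frame only."""
    return len(driver.find_elements(*IFRAME))


def enter_frame_and_return(driver, index, to_top=False):
    """Switch into the frame at a zero-based index, then leave it again.

    to_top=False steps up one level (parent_frame);
    to_top=True jumps to the top-level page (default_content).
    Returns the number of iframes found inside the visited frame.
    """
    driver.switch_to.frame(index)        # later calls now target this frame
    inner_count = count_iframes(driver)  # iframes nested one level deeper
    if to_top:
        driver.switch_to.default_content()
    else:
        driver.switch_to.parent_frame()
    return inner_count
```

With a real browser, `count_iframes(driver)` right after `driver.get(...)` would report only the top-level iframes, which is why nested pages need some form of recursion.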
I was not able to find a website with several layers of nested frames to fully test this concept, but I was able to test it on a site with just one layer of nested frames. So, this might require a bit of debugging to deal with deeper nesting. Also, this code assumes that each of the iframes has a name attribute.
I believe that using a recursive function along these lines will solve the issue for you, and here's an example data structure to go along with it:
    from selenium.webdriver.common.by import By

    # `browser` is an existing WebDriver instance.
    # Use 'iframe' instead of 'frame' (here and in the XPaths) for <iframe> elements.
    def frame_search(path):
        framedict = {}
        # Read the names up front so no element reference is reused after switching frames.
        frames = browser.find_elements(By.TAG_NAME, 'frame')
        names = [frame.get_attribute('name') for frame in frames]
        for child_frame_name in names:
            framedict[child_frame_name] = {'framepath': path, 'children': {}}
            xpath = '//frame[@name="{}"]'.format(child_frame_name)
            browser.switch_to.frame(browser.find_element(By.XPATH, xpath))
            framedict[child_frame_name]['children'] = frame_search(path + [child_frame_name])
            # ...
            # do something involving this child_frame
            # ...
            # Go back to the top, then walk down the saved path to the frame being iterated.
            browser.switch_to.default_content()
            for parent in path:
                parent_xpath = '//frame[@name="{}"]'.format(parent)
                browser.switch_to.frame(browser.find_element(By.XPATH, parent_xpath))
        return framedict
You'd kick it off by calling frametree = frame_search([]), and the framedict would end up looking something like this:
    frametree = {
        'child1': {'framepath': [], 'children': {'child1.1': {'framepath': ['child1'], 'children': {...etc}}}},
        'child2': {'framepath': [], 'children': {'child2.1': {'framepath': ['child2'], 'children': {...etc}}}},
    }
A note: I identify frames by their attributes rather than reusing the elements returned by the find_elements method because, in certain scenarios, Selenium throws a stale element exception (StaleElementReferenceException) once a page has been open for a while, and the stored elements are no longer usable. The frame's attributes are unlikely to change, so looking the frame up again by XPath is a bit more stable. Hope this helps.
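Once a tree of that shape exists, the saved paths can be used to jump straight back to any frame. The following is a minimal sketch under the same assumption that every frame has a name attribute; `iter_frame_paths` and `switch_to_frame_path` are hypothetical helper names, not part of Selenium.

```python
def iter_frame_paths(frametree):
    """Yield the full name path (outermost first) of every frame in the tree."""
    for name, info in frametree.items():
        yield info["framepath"] + [name]
        # Recurse into nested frames, depth first.
        yield from iter_frame_paths(info["children"])


def switch_to_frame_path(driver, path):
    """Re-enter a nested frame by starting at the top and following its names."""
    driver.switch_to.default_content()
    for name in path:
        # switch_to.frame accepts a frame name or id as a string.
        driver.switch_to.frame(name)
```

For example, `for path in iter_frame_paths(frametree): switch_to_frame_path(browser, path)` would visit each frame in turn without keeping any element references alive between visits.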
In my experience, finding iframes solely by HTML element tag or attributes (including ID) is unreliable.
Recursively switching into iframes by index, on the other hand, works reasonably well.
    from selenium.webdriver.common.by import By

    def find_all_iframes(driver):
        iframes = driver.find_elements(By.XPATH, "//iframe")
        for index, iframe in enumerate(iframes):
            # Your sweet business logic applied to iframe goes here.
            driver.switch_to.frame(index)
            find_all_iframes(driver)
            driver.switch_to.parent_frame()
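To tie this back to the original goal of searching all the HTML, the index-based recursion above could be extended to collect each frame's source. This is only a sketch: `collect_frame_sources` is a hypothetical name, it assumes `driver.page_source` returns the document of the currently selected frame, and it assumes the i-th iframe element corresponds to frame index i (which may not hold on pages that mix `frame` and `iframe` elements).

```python
def collect_frame_sources(driver, path=()):
    """Return {index_path: html} for the current frame and all nested iframes.

    index_path is a tuple of zero-based frame indexes from the top-level
    page, so () is the main document and (0, 1) is the second iframe
    inside the first iframe.
    """
    sources = {path: driver.page_source}
    # ("tag name", "iframe") is equivalent to (By.TAG_NAME, "iframe").
    count = len(driver.find_elements("tag name", "iframe"))
    for index in range(count):
        driver.switch_to.frame(index)
        sources.update(collect_frame_sources(driver, path + (index,)))
        # Step back up one level so the next index is resolved in this frame.
        driver.switch_to.parent_frame()
    return sources
```

A search over every frame then becomes a plain dictionary scan, for example `[p for p, html in collect_frame_sources(driver).items() if "needle" in html]`.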