I'm writing an application that tries to determine whether there are ads on a page. It currently drives a browser through Selenium WebDriver using Python.
I figured that a good number of ads live inside iframes, so I wrote a loop to look inside each frame:
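    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    browser.get("http://cnn.com")
    all_iframes = browser.find_elements(By.TAG_NAME, "iframe")
    for iframe in all_iframes:
        # Enter the frame, print its HTML, then return to the top-level page
        browser.switch_to.frame(iframe)
        print(browser.page_source)
        browser.switch_to.default_content()
    browser.quit()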
I'm wondering whether there are any consistently used tags or tag attributes that I can rely on across multiple pages to determine whether a page contains ads (both inside and outside iframes). Would I have to look for instances of things like doubleclick, adtech, or adblade inside each frame?
Or would I have to generate different rules for checking on a per-page basis?
Anyone in the know about how ads are displayed on pages? Thanks.
You can search for known ad servers. A list of ad-server hostnames is available here:
http://pgl.yoyo.org/as/serverlist.php?hostformat=adblockplus
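As a rough sketch of that idea, you could compare the host of each URL found on the page (for example, an iframe's src) against a set of ad-server hostnames. The hostnames in SAMPLE_AD_HOSTS and the helper names is_ad_url and find_ad_urls are made up for illustration; a real check would load the hostnames from a maintained list such as the one above.

```python
from urllib.parse import urlparse

# Illustrative sample only (an assumption, not taken from the list above);
# in practice, load hostnames from a maintained ad-server list.
SAMPLE_AD_HOSTS = {"doubleclick.net", "adtech.de", "adblade.com"}


def is_ad_url(url, ad_hosts=SAMPLE_AD_HOSTS):
    """Return True if the URL's host is a listed ad host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Check the full host and every parent domain: a.b.c -> a.b.c, b.c, c
    return any(".".join(parts[i:]) in ad_hosts for i in range(len(parts)))


def find_ad_urls(urls, ad_hosts=SAMPLE_AD_HOSTS):
    """Filter URLs (e.g. iframe src attributes) down to those served by ad hosts."""
    return [u for u in urls if u and is_ad_url(u, ad_hosts)]
```

With Selenium, the input could be something like `[f.get_attribute("src") for f in all_iframes]`. Matching on whole domain labels rather than substrings avoids flagging unrelated hosts that merely contain an ad network's name.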
It would be helpful to look at other projects and see how they handle doing the same task:
http://adblockplus.org/en/source