
Finding Ads on a web page

I'm writing an application that tries to determine whether there are ads on a page. It currently drives a browser through Selenium WebDriver using Python.

I figured that a good number of ads live inside iframes, so I wrote a loop to look inside each frame:

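from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get("http://cnn.com")

# Collect every iframe on the top-level page
all_iframes = browser.find_elements(By.TAG_NAME, "iframe")

for iframe in all_iframes:
    # Enter the frame, dump its source, then return to the top-level page
    browser.switch_to.frame(iframe)
    print(browser.page_source)
    browser.switch_to.default_content()

browser.quit()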

I'm wondering whether there are any tags or tag attributes that appear consistently across multiple pages and that I can use to determine whether a page contains ads (both inside and outside of iframes). Would I have to look for instances of things like doubleclick, adtech, or adblade inside each frame?

Or would I have to generate different rules for checking on a per-page basis?

Anyone in the know about how ads are displayed on pages? Thanks.

Fal-Cone asked Nov 16 '12 19:11



1 Answer

You can search the page for references to known ad servers. A list of them is available here in Adblock Plus format:

http://pgl.yoyo.org/as/serverlist.php?hostformat=adblockplus
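As a rough illustration of searching by ad server, here is a minimal sketch. It assumes the list contains lines shaped like `||example.com^` (everything else is skipped), and the helper names `parse_abp_hosts` and `find_ad_hosts` are made up for this example. It only looks at `src` and `href` attributes, so ads injected by inline scripts would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def parse_abp_hosts(text):
    """Collect hostnames from lines shaped like '||example.com^' (assumed format)."""
    hosts = set()
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("||") and line.endswith("^"):
            hosts.add(line[2:-1].lower())
    return hosts


class _UrlCollector(HTMLParser):
    """Gathers the values of src and href attributes from every tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.urls.append(value)


def find_ad_hosts(html, ad_hosts):
    """Return the set of listed ad hosts that the markup links to."""
    collector = _UrlCollector()
    collector.feed(html)
    found = set()
    for url in collector.urls:
        host = (urlparse(url).hostname or "").lower()
        # Match the host itself or any subdomain of a listed host.
        for ad_host in ad_hosts:
            if host == ad_host or host.endswith("." + ad_host):
                found.add(ad_host)
    return found
```

With Selenium, you could pass `browser.page_source` for the top-level page and again for each frame after switching into it; a non-empty result suggests that the page or frame pulls content from a listed ad server.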

It would also help to look at how other projects handle the same task, for example Adblock Plus:

http://adblockplus.org/en/source

dm03514 answered Oct 02 '22 15:10
