
Get uBlock Origin logger data using Python and Selenium

I'd like to know the number of trackers blocked by uBlock Origin, using Python (running on a Linux server, so no GUI) and Selenium (with the Firefox driver). I don't necessarily need to actually block them, but I need to know how many there are.

uBlock Origin has a logger (https://github.com/gorhill/uBlock/wiki/The-logger#settings-dialog) which I'd like to scrape.

This logger is available through a URL like this: moz-extension://fc469b55-3182-4104-a95c-6b0b4f87cf0f/logger-ui.html#_ where fc469b55-3182-4104-a95c-6b0b4f87cf0f is the UUID of the uBlock Origin add-on.
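A minimal sketch of building that URL from the profile's prefs.js. It assumes the `extensions.webextensions.uuids` pref holds a JSON object stored as an escaped string, and that the add-on ID is `uBlock0@raymondhill.net`; the function names `find_extension_uuid` and `logger_url` are hypothetical.

```python
import json
import re

# Assumed shape of the pref line: the value is a quoted, escaped JSON object
# mapping add-on IDs to their per-profile UUIDs.
_UUID_PREF = re.compile(r'user_pref\("extensions\.webextensions\.uuids",\s*(".*")\);')


def find_extension_uuid(prefs_text, addon_id):
    """Return the internal UUID for addon_id, or None if it is not listed."""
    for line in prefs_text.splitlines():
        match = _UUID_PREF.search(line)
        if match:
            # Decode the quoted string first, then the JSON object inside it.
            mapping = json.loads(json.loads(match.group(1)))
            return mapping.get(addon_id)
    return None


def logger_url(uuid):
    """Build the logger page URL for a given add-on UUID."""
    return "moz-extension://{}/logger-ui.html#_".format(uuid)
```

Parsing the value as JSON avoids depending on character offsets within the line, at the cost of assuming the pref is valid JSON once unescaped.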

In this logger, each entry is a div with its class set to "logEntry" (yellow outline in the screenshot), and I'd like to get the data in the green outline.

So far, i got this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options as FirefoxOptions
browser_options = FirefoxOptions()
browser_options.headless = True
              
#   Activate add on
str_ublock_extension_path = "/usr/local/bin/uBlock0_1.45.3b10.firefox.signed.xpi"
browser = webdriver.Firefox(executable_path='/usr/local/bin/geckodriver', options=browser_options)
str_id  = browser.install_addon(str_ublock_extension_path)
        
#   Getting the UUID which is new each time the script is launched
profile_path = browser.capabilities['moz:profile']    
id_extension_firefox = "uBlock0@raymondhill.net"
with open('{}/prefs.js'.format(profile_path), 'r') as file_prefs:
    lines = file_prefs.readlines()
    for line in lines:
        if 'extensions.webextensions.uuids' in line:
            sublines = line.split(',')
            for subline in sublines:
                if id_extension_firefox in subline:
                    internal_uuid = subline.split(':')[1][2:38]

str_uoo_panel_url = "moz-extension://" + internal_uuid + "/logger-ui.html#_"
ubo_logger = browser.get(str_uoo_panel_url)
ubo_logger_log_entries = ubo_logger.find_element(By.CLASS_NAME, "logEntry")

for log_entrie in ubo_logger_log_entries:
    print(log_entrie.text)
    

Loading this unusual moz-extension:// URL seems to work, since print(browser.page_source) outputs relevant HTML.

Problem: ubo_logger.find_element(By.CLASS_NAME, "logEntry") returns nothing. What did I do wrong?
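Two details of the Selenium API are worth noting here: `browser.get()` returns `None`, so lookups have to be made on `browser` itself, and `find_element` returns a single element while `find_elements` returns a list that can be iterated. Beyond that, if the logger page fills in its entries from script after the page has loaded (an assumption, not confirmed above), an immediate lookup can still come back empty. The sketch below polls until something shows up; `wait_for_entries` is a hypothetical helper, and Selenium's own `WebDriverWait` could serve the same purpose.

```python
import time


def wait_for_entries(find_entries, timeout=10.0, poll_interval=0.5):
    """Call find_entries() until it returns a non-empty list or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        entries = find_entries()
        if entries or time.monotonic() >= deadline:
            return entries
        time.sleep(poll_interval)


# Usage with the objects from the script above (needs a live Firefox session):
# browser.get(str_uoo_panel_url)  # returns None, so keep querying `browser`
# entries = wait_for_entries(
#     lambda: browser.find_elements(By.CLASS_NAME, "logEntry"))
# for entry in entries:
#     print(entry.text)
```

The helper takes a callable rather than a driver so the lookup strategy (class name, XPath, and so on) stays with the caller.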

8oris asked Oct 18 '25 12:10


1 Answer

I found this to work (driver here is the Selenium WebDriver instance, named browser in the question):

parent = driver.find_element(by=By.XPATH, value='//*[@id="vwContent"]')
children = parent.find_elements(by=By.XPATH, value='./child::*')

for child in children:
    attributes = (child.find_element(by=By.XPATH, value='./child::*')).find_elements(by=By.XPATH, value='./child::*')
    print(attributes[4].text)

You could then also do:

if attributes[4].text.isdigit():
    result = int(attributes[4].text)

This converts the resulting text into an int.
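One caveat with the `isdigit()` check: it also returns `True` for some characters that `int()` rejects, such as the superscript `²`. A slightly more defensive sketch lets `int()` decide and catches the failure instead; the helper names `to_int_or_none` and `sum_numeric` are hypothetical.

```python
def to_int_or_none(text):
    """Return int(text) when text parses as an integer, otherwise None."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


def sum_numeric(texts):
    """Add up every value that parses as an integer and skip the rest."""
    values = (to_int_or_none(text) for text in texts)
    return sum(value for value in values if value is not None)
```

Unlike `isdigit()`, this version also accepts signed values and surrounding whitespace, for example `"-5"` and `" 7 "`.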

Teddy answered Oct 20 '25 02:10


