
Find the xpath with get_attribute() in python selenium

This is a somewhat backwards approach to web scraping: I need to obtain the XPath of a web element AFTER I have already found it with a text()= predicate.

Because the XPath values change depending on what information shows up, I need to use predictable labels inside the row to locate the span text next to the found element. A simple and reliable way I found is to locate the keyword label and then increase the td index by one inside the XPath.

    def x_label(self, contains):
        mls_data_xpath = f"//span[text()='{contains}']"
        string = self.driver.find_element_by_xpath(mls_data_xpath).get_attribute("xpath")
        digits = string.split("td[")[1]
        num = int(re.findall(r'(\d+)', digits)[0]) + 1
        labeled_data = f'{string.split("td[")[0]}td[{num}]/span'
        print(labeled_data)
        labeled_text = self.driver.find_element_by_xpath(labeled_data).text
        return labeled_text

I cannot find much information on get_attribute() and get_property(), so I am hoping there is something like get_attribute("xpath"), but I haven't been able to find it.

Basically, I take in a string like "ApprxTotalLivArea", which I can rely on, and then increase the integer n in td[n] by 1 to find the span data in the neighbouring cell. I am hoping there is something like get_attribute("xpath") that returns the XPath string of the element I locate through my text()='{contains}' search.


Richard Modad asked Jan 20 '26 17:01


2 Answers

The remote WebElement includes the following methods:

  • get_attribute()
  • get_dom_attribute()
  • get_property()

But xpath is neither an HTML attribute nor a DOM property of a WebElement, so get_attribute("xpath") will always return None.
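Since there is no xpath value to read back, one option is to let XPath itself step from the label to the neighbouring cell. The following is only a minimal sketch: it assumes the label span sits inside a td and the value is a span in the next td of the same row, which is inferred from the question rather than checked against the real page. The helper names value_xpath and labeled_text are hypothetical.

```python
def value_xpath(label: str) -> str:
    """XPath for the span in the cell right after the cell holding `label`."""
    # Assumes `label` contains no single quote, since it is embedded in '...'.
    return (
        f"//span[text()='{label}']"
        "/ancestor::td[1]"            # nearest td that contains the label span
        "/following-sibling::td[1]"   # the next td in the same row
        "/span"
    )


def labeled_text(driver, label: str) -> str:
    """Return the text next to `label`; `driver` is a Selenium WebDriver."""
    # "xpath" is the string value of Selenium's By.XPATH constant.
    return driver.find_element("xpath", value_xpath(label)).text
```

With a live driver this would be called as labeled_text(driver, "ApprxTotalLivArea"). The row position is never parsed, so nothing has to be incremented by hand.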

undetected Selenium answered Jan 22 '26 07:01


This function builds the XPath by iteratively getting the parent until it hits the html element at the top:

    from selenium import webdriver
    from selenium.webdriver.common.by import By


    def get_xpath(elm):
        """Build an absolute XPath for elm by walking up to the html element."""
        steps = []
        e = elm
        while True:
            tag = e.tag_name
            # Same-tag children of e's parent, in document order (includes e).
            siblings = e.find_elements(By.XPATH, "../" + tag)
            step = tag
            if len(siblings) > 1:
                # XPath positions are 1-based.
                step += "[" + str(siblings.index(e) + 1) + "]"
            steps.append(step)
            if tag == "html":
                break
            e = e.find_element(By.XPATH, "..")  # move up to the parent
        return "/" + "/".join(reversed(steps))


    driver = webdriver.Chrome()
    driver.get("https://www.stackoverflow.com")
    login = driver.find_element(By.XPATH, "//a[text() ='Log in']")
    xpath = get_xpath(login)
    print(xpath)

    # The generated XPath should resolve back to the same element.
    assert login == driver.find_element(By.XPATH, xpath)
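Once get_xpath has returned an absolute path for the label, the "increase td by one" step from the question is plain string work. The sketch below is one possible way to do it, assuming the value lives in a span directly under the next td, as in the question. The name next_cell_xpath and the sample path are made up for illustration.

```python
import re


def next_cell_xpath(label_xpath: str) -> str:
    """Return the XPath of the span in the td that follows the label's td."""
    # Greedy ".*" makes the pattern match the LAST "/td[n]" step in the path.
    match = re.match(r"^(?P<head>.*/td\[)(?P<index>\d+)\]", label_xpath)
    if match is None:
        # No indexed td step, e.g. the label's td has no same-tag siblings.
        raise ValueError(f"no indexed td step in {label_xpath!r}")
    next_index = int(match.group("index")) + 1
    return f"{match.group('head')}{next_index}]/span"


print(next_cell_xpath("/html/body/table/tbody/tr[3]/td[2]/span"))
# prints /html/body/table/tbody/tr[3]/td[3]/span
```

Matching the last td step rather than the first keeps this working when the table is nested inside another table cell.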

Hope this helps!

Tom Fuller answered Jan 22 '26 05:01


