Python Selenium - get href value

Tags: python, css-selectors, selenium, xpath, webdriverwait

I am trying to copy the href value from a website, and the HTML looks like this:

<p class="sc-eYdvao kvdWiq">
  <a href="https://www.iproperty.com.my/property/setia-eco-park/sale-1653165/">Shah Alam Setia Eco Park, Setia Eco Park</a>
</p>

I've tried driver.find_elements_by_css_selector(".sc-eYdvao.kvdWiq").get_attribute("href"), but it raised 'list' object has no attribute 'get_attribute'. Using driver.find_element_by_css_selector(".sc-eYdvao.kvdWiq").get_attribute("href") returned None. I can't use XPath because the page has 20+ hrefs and I need to copy all of them; using XPath would only copy one.

If it helps, all 20+ hrefs are under the same class, which is sc-eYdvao kvdWiq.

Ultimately I want to copy all 20+ hrefs and export them to a CSV file.

Appreciate any help possible.

asked Feb 25 '19 08:02

Eric Choi

2 Answers

Use driver.find_elements when you expect more than one element. It returns a list, so call get_attribute on each item rather than on the list itself. In the CSS selector, make sure you target the descendants of those classes that have an href attribute:

elems = driver.find_elements_by_css_selector(".sc-eYdvao.kvdWiq [href]")
links = [elem.get_attribute('href') for elem in elems]

You might also need a wait condition for the presence of all elements located by the CSS selector:

elems = WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".sc-eYdvao.kvdWiq [href]")))
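The question also asks for the links to end up in a CSV file. A minimal sketch of that last step, using only the standard library, might look like the following. The helper name `write_links_to_csv`, the `href` header, and the output filename are assumptions for illustration; `links` stands for the list built from the located elements.

```python
import csv


def write_links_to_csv(links, path):
    """Write one URL per row to a CSV file under a single 'href' header."""
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.writer, so that row endings are not doubled on Windows.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["href"])      # header row (hypothetical column name)
        for link in links:
            writer.writerow([link])    # one link per row


# Usage, assuming `links` was collected with Selenium as in the answer:
#   write_links_to_csv(links, "links.csv")
```

Writing each link as its own row keeps the file easy to open in a spreadsheet; `csv.writer` also takes care of quoting if a URL happens to contain a comma.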

answered Sep 16 '22 15:09

QHarr

As per the given HTML:

<p class="sc-eYdvao kvdWiq">
  <a href="https://www.iproperty.com.my/property/setia-eco-park/sale-1653165/">Shah Alam Setia Eco Park, Setia Eco Park</a>
</p>

Since the href attribute belongs to the <a> tag, the locator needs to go one level deeper than the <p> and target the <a> node. To extract the value of the href attribute, you can use either of the following locator strategies:

Using css_selector:

print(driver.find_element_by_css_selector("p.sc-eYdvao.kvdWiq > a").get_attribute('href'))

Using xpath:

print(driver.find_element_by_xpath("//p[@class='sc-eYdvao kvdWiq']/a").get_attribute('href'))

If you want to extract all the values of the href attribute you need to use find_elements* instead:

Using css_selector:

print([my_elem.get_attribute("href") for my_elem in driver.find_elements_by_css_selector("p.sc-eYdvao.kvdWiq > a")])

Using xpath:

print([my_elem.get_attribute("href") for my_elem in driver.find_elements_by_xpath("//p[@class='sc-eYdvao kvdWiq']/a")])
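Note that the find_element_by_* and find_elements_by_* helpers used above were removed in later Selenium 4 releases in favour of find_elements(By.CSS_SELECTOR, ...) and find_elements(By.XPATH, ...). A rough sketch of the same idea in that style follows; the function name `extract_hrefs` is hypothetical, and it only assumes a driver object exposing `find_elements(by, value)`.

```python
def extract_hrefs(driver, by, selector):
    """Return the non-empty href values of every element matching the locator."""
    # find_elements always returns a list (possibly empty), which is why
    # get_attribute has to be called on each element, not on the list.
    elements = driver.find_elements(by, selector)
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href:  # skip elements without an href, e.g. a matched <p>
            hrefs.append(href)
    return hrefs


# Usage with a real browser (assumes Selenium 4 and a working Chrome driver):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Chrome()
#   driver.get(page_url)  # page_url is a placeholder for the listing page
#   print(extract_hrefs(driver, By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a"))
```

Filtering out empty values also covers the case from the question, where the selector matched the <p> element and get_attribute("href") returned None.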

Dynamic elements

However, the class values sc-eYdvao and kvdWiq look like dynamically generated values. So to extract the href attribute, you need to use WebDriverWait with visibility_of_element_located(), and you can use either of the following locator strategies:

Using CSS_SELECTOR:

print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a"))).get_attribute('href'))

Using XPATH:

print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a"))).get_attribute('href'))

If you want to extract all the values of the href attribute you can use visibility_of_all_elements_located() instead:

Using CSS_SELECTOR:

print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a")))])

Using XPATH:

print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
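To make the waiting step less opaque, here is a simplified illustration of what an explicit wait does conceptually: it calls a condition repeatedly until the result is truthy or the timeout runs out. This is a stand-in written for illustration, not Selenium's implementation. `wait_until` and its parameters are hypothetical names, the condition here takes no arguments (WebDriverWait passes the driver to it), and the built-in TimeoutError is used in place of Selenium's own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # e.g. a non-empty list of located elements
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)  # pause before checking again
```

The practical consequence is that a wait returns as soon as the elements show up, so a generous timeout does not slow the script down on pages that load quickly.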

answered Sep 17 '22 15:09

undetected Selenium
