
Using Playwright for Python, how do I select (or find) an element?

I'm trying to learn the Python version of Playwright. See here

I would like to learn how to locate an element so that I can do things with it, such as printing its inner HTML or clicking on it.

The example below loads a page and prints its HTML:

from playwright import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.newPage()
    page.goto('http://whatsmyuseragent.org/')
    print(page.innerHTML("*"))
    browser.close()

The page contains this element:

<div class="user-agent">
    <p class="intro-text">Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4238.0 Safari/537.36</p>
</div>

Using Selenium, I could locate the element and print its content like this:

elem = driver.find_element_by_class_name("user-agent")
print(elem)
print(elem.get_attribute("innerHTML"))

How can I do the same in Playwright?

UPDATE: If you want to run this in 2021 or later, note that current versions of Playwright for Python have changed the method names from camelCase to snake_case.
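
To illustrate the renaming, here is a rough sketch of the naming pattern only. The helper `camel_to_snake` is a hypothetical name and not part of Playwright; it is a mnemonic for translating old snippets, so the resulting names should still be checked against the documentation.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase identifier to snake_case (illustrative only)."""
    # Insert "_" wherever a lowercase letter or digit is followed by an
    # uppercase letter, then lowercase the whole name.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# The camelCase calls used in the snippets on this page.
for old_name in ("newPage", "innerHTML", "querySelector", "goto"):
    print(old_name, "->", camel_to_snake(old_name))
```

Applied to the example above, `browser.newPage()` becomes `browser.new_page()` and `page.innerHTML(...)` becomes `page.inner_html(...)`.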

asked Oct 11 '20 11:10 by 576i



2 Answers

The accepted answer does not work with the newer versions of Playwright. (Thanks @576i for pointing this out)

Here is the Python code that works with the newer versions (tested with version 1.5):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto('http://whatsmyuseragent.org/')
    ua = page.query_selector(".user-agent")
    print(ua.inner_html())
    browser.close()

To get only the text, use the inner_text() method:

print(ua.inner_text())
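
As a possible alternative, more recent Playwright releases also provide `page.locator()`, which returns an object with `inner_html()`, `inner_text()` and `click()` methods. The following is a minimal sketch, assuming a Playwright version recent enough to include locators. The helper name `read_element` and the inline HTML are made up for illustration, and `set_content` is used so the example does not depend on the site being reachable.

```python
def read_element(page, selector):
    """Return (inner_html, inner_text) of the first element matching selector."""
    element = page.locator(selector).first
    return element.inner_html(), element.inner_text()


def main():
    # Imported here so read_element can be reused without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Hypothetical stand-in for the real page: load inline HTML instead.
        page.set_content(
            '<div class="user-agent"><p class="intro-text">Example UA</p></div>'
        )
        html, text = read_element(page, ".user-agent")
        print(html)
        print(text)
        # Locators can also be clicked, which covers the "clicking on it" case.
        page.locator(".user-agent").click()
        browser.close()
```

Calling `main()` runs the sketch against the inline HTML; swapping `set_content` for `page.goto(...)` would point it at a live page instead.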
answered Nov 09 '22 17:11 by Upendra


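You can use the querySelector function and then call innerHTML on the returned handle (these are the camelCase names from older releases; newer versions use query_selector and inner_html):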

handle = page.querySelector(".user-agent")
print(handle.innerHTML())
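
One thing to watch for: the query returns `None` when no element matches, so calling a method on the result directly can raise an `AttributeError`. A minimal sketch of a guard, written with the snake_case names used by newer versions; the helper name `inner_html_or_none` is hypothetical.

```python
def inner_html_or_none(page, selector):
    """Return the inner HTML of the first match, or None if nothing matches."""
    handle = page.query_selector(selector)
    if handle is None:
        # No element matched the selector.
        return None
    return handle.inner_html()
```

With a `page` object like the one above, `inner_html_or_none(page, ".user-agent")` gives the element's HTML, and a selector that matches nothing gives `None` instead of an exception.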
answered Nov 09 '22 17:11 by hardkoded