I parse HTML with Python.
After parsing I search for some elements in the tree.
So far I have not found an easy-to-use way to find elements in the tree. XPath is available, but I would prefer something more familiar.
Is there a way to use selectors in Python with a syntax similar to jQuery/CSS selectors?
jQuery uses CSS-style selectors to select parts, or elements, of an HTML page. It then lets you do something with those elements using jQuery methods, or functions. To use a selector, type a dollar sign followed by parentheses containing the selector: $(). This is shorthand for the jQuery() function.
.is( selector ) Returns: Boolean. Description: Check the current matched set of elements against a selector, element, or jQuery object and return true if at least one of these elements matches the given arguments.
The jQuery Object: The Wrapped Set: Selectors return a jQuery object known as the "wrapped set," which is an array-like structure that contains all the selected DOM elements. You can iterate over the wrapped set like an array or access individual elements via the indexer ($(sel)[0] for example).
exists property (from TestCafe's Selector API, not jQuery itself). Returns true if the Selector query matches at least one element. Selectors can match any number of DOM elements, from zero upward. Use the exists property to determine whether matching elements exist.
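For comparison, here is a minimal Python sketch of the same ideas (a collection of matches, index access, and an existence check) using BeautifulSoup. The HTML snippet and variable names are made up for illustration, and it assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration.
html = """
<ul id="menu">
  <li class="item">Home</li>
  <li class="item active">Docs</li>
  <li class="item">About</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# select() returns a list-like collection of every matching tag,
# roughly comparable to jQuery's wrapped set.
items = soup.select("li.item")

# Index access, similar to $(sel)[0].
first = items[0]

# An empty result is falsy, so bool() works as an "exists" check.
has_active = bool(soup.select("li.active"))

# select_one() returns the first match, or None if nothing matches.
has_footer = soup.select_one("footer") is not None
```

Because the result behaves like a plain list, the usual Python tools (len(), slicing, for loops, list comprehensions) apply to it directly.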
The following selector: $("div") selects all div elements in the HTML document, not just the first one.
The :disabled selector selects all disabled form elements.
The .val() method gets the input value of the first matched element.
BeautifulSoup has CSS selector support built in:
>>> from bs4 import BeautifulSoup
>>> # Python 3: urlopen lives in urllib.request (it was urllib2 in Python 2)
>>> from urllib.request import urlopen
>>> soup = BeautifulSoup(urlopen("https://google.com"), "html.parser")
>>> soup.select("input[name=q]")
[<input autocomplete="off" class="lst" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top" title="Google Search" value=""/>]
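The live page can change at any time, so here is a small offline sketch of the same select() call against an inline HTML string. The markup is invented for the example; only select() and select_one() are the actual BeautifulSoup calls being shown.

```python
from bs4 import BeautifulSoup

# Hypothetical form markup standing in for a real search page.
html = """
<form action="/search">
  <input name="q" title="Search" value="">
  <input name="btn" type="submit" value="Go">
</form>
"""
soup = BeautifulSoup(html, "html.parser")

# Attribute selector: every <input> whose name attribute equals "q".
matches = soup.select("input[name=q]")
titles = [tag["title"] for tag in matches]

# Child combinator plus attribute selector; select_one() returns the
# first match only (or None when nothing matches).
submit = soup.select_one("form > input[type=submit]")
```

Each match is an ordinary Tag object, so attributes are read with tag["name"] or tag.get("name"), just as with find() and find_all().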
There is also the cssselect package, which you can use in combination with lxml.
Note that there are certain limitations in how CSS selectors work in BeautifulSoup; lxml + cssselect supports more CSS selectors:
This is all a convenience for users who know the CSS selector syntax. You can do all this stuff with the Beautiful Soup API. And if CSS selectors are all you need, you might as well use lxml directly: it’s a lot faster, and it supports more CSS selectors. But this lets you combine simple CSS selectors with the Beautiful Soup API.
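As a rough illustration of that last point, the sketch below narrows the tree with a CSS selector and then continues with regular Beautiful Soup methods. The markup and names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment.
html = """
<div id="content">
  <h2>Links</h2>
  <p>See <a href="/a">A</a> and <a href="/b">B</a>.</p>
</div>
<div id="footer"><a href="/c">C</a></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Step 1: a CSS selector picks the container.
content = soup.select_one("div#content")

# Step 2: the regular Beautiful Soup API takes over from there.
hrefs = [a["href"] for a in content.find_all("a", href=True)]
heading = content.find("h2").get_text()

# The other direction also works: select with CSS, then navigate upward.
parent_ids = [a.find_parent("div")["id"] for a in soup.select("a[href]")]
```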