
Straight LXML or PyQuery

Does anyone have experience scraping with straight lxml vs. PyQuery? I came across the latter recently and was intrigued. I haven't been able to find many comments about the library yet, so I'm curious how robust it is.

I'm familiar with lxml and generally enjoy it. It would be nice, however, to use jQuery selector syntax.

Is the switch worth it?

Thanks!

Ben asked Apr 01 '26 16:04


2 Answers

lxml supports XPath, which covers much the same ground as CSS selectors (and can also select attributes and text nodes directly). Would that meet your needs?
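As a rough illustration of that comparison, here is a minimal sketch using only lxml's built-in XPath support. The HTML snippet and the variable names are made up for the example, and the class test is an exact string match, which is stricter than a real CSS class selector.

```python
from lxml import html

# Hypothetical HTML snippet, invented for this illustration.
PAGE = """
<html><body>
  <div class="post"><a href="/a">First</a></div>
  <div class="post"><a href="/b">Second</a></div>
  <div class="sidebar"><a href="/c">Other</a></div>
</body></html>
"""

doc = html.fromstring(PAGE)

# Roughly what the CSS selector "div.post > a" would match
# (here the class attribute must equal "post" exactly).
links = doc.xpath('//div[@class="post"]/a')
hrefs = [a.get("href") for a in links]

# XPath can also return text nodes directly, which CSS selectors cannot.
texts = doc.xpath('//div[@class="post"]/a/text()')

print(hrefs)
print(texts)
```

If you prefer CSS syntax without PyQuery, lxml elements also have a `cssselect()` method, though it needs the separate `cssselect` package installed.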

hoju answered Apr 04 '26 07:04


Only you can answer the question of whether it's worth it.

It simply depends on whether you want to use an extra dependency in order to get jQuery's custom CSS selectors.

Here are the things jQuery adds on top of the standard CSS selectors: http://api.jquery.com/category/selectors/jquery-selector-extensions/

And here is where PyQuery translates those selectors, by patching the same cssselect machinery that lxml uses for standard CSS selectors: https://bitbucket.org/olauzanne/pyquery/src/c2bf08a8f4e7/pyquery/cssselectpatch.py

I don't see why it should be any less robust than using plain CSS selectors with lxml. It simply adds translations for the special jQuery selectors alongside the standard ones.

Acorn answered Apr 04 '26 05:04


