I'm new to the Scrapy framework. I've seen some tutorials using LinkExtractor and a few using SgmlLinkExtractor. I've tried searching for the differences and pros/cons of the two, but the results haven't been satisfying.
Can someone explain the difference between them? When should each extractor be used?
Thanks!
The reason you cannot find references to SgmlLinkExtractor is that it is now deprecated (related changeset). You can find the SgmlLinkExtractor definition here, inside the Scrapy 0.24 docs.
You should not use SgmlLinkExtractor anymore. Scrapy now ships a single link extractor, LxmlLinkExtractor, which is the one the LinkExtractor alias points to.