I'm new to lxml and I'm trying to figure out how to rewrite links using iterlinks().
import lxml.html
html = lxml.html.document_fromstring(doc)
for element, attribute, link, pos in html.iterlinks():
    if attribute == "src":
        link = link.replace('foo', 'bar')
print(lxml.html.tostring(html))
However, this doesn't actually replace the links. I know I can use .rewrite_links, but iterlinks provides more information about each link, so I would prefer to use this.
Thanks in advance.
Assigning a new string to the variable name link only rebinds that local name; the document itself is left unchanged. Instead, you have to alter the element itself, in this case by setting its src attribute:
new_src = link.replace('foo', 'bar') # or element.get('src').replace('foo', 'bar')
element.set('src', new_src)
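Put together, the loop from the question might look like the minimal sketch below. The sample markup in doc is a hypothetical stand-in for the real document, and wrapping iterlinks() in list() is just a precaution so the tree is not modified while it is still being iterated.

```python
import lxml.html

# Hypothetical sample document, for illustration only.
doc = (
    '<html><body>'
    '<img src="/foo/a.png">'
    '<a href="/foo/page.html">page</a>'
    '</body></html>'
)

html = lxml.html.document_fromstring(doc)

# Collect the links first, then modify the elements they belong to.
for element, attribute, link, pos in list(html.iterlinks()):
    # attribute is None for links found in element text, so this
    # comparison also skips those entries.
    if attribute == "src":
        element.set(attribute, link.replace("foo", "bar"))

# encoding="unicode" makes tostring() return str rather than bytes.
result = lxml.html.tostring(html, encoding="unicode")
print(result)
```

Only the img element's src is rewritten here; the href on the a element is left alone because its attribute name does not match.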
Note that if you already know which links you are interested in (for example, only img elements), you can also select those elements directly with .findall() (or XPath or CSS selectors) instead of using .iterlinks().
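As a rough sketch of that alternative, assuming only img elements matter, .findall() with an ElementPath expression could be used like this. The sample markup is again hypothetical.

```python
import lxml.html

# Hypothetical sample document, for illustration only.
doc = (
    '<html><body>'
    '<img src="/foo/a.png">'
    '<img alt="no source">'
    '<script src="/foo/app.js"></script>'
    '</body></html>'
)

html = lxml.html.document_fromstring(doc)

# ".//img" matches every img element anywhere below the root.
for img in html.findall(".//img"):
    src = img.get("src")
    # Skip img elements that have no src attribute at all.
    if src is not None:
        img.set("src", src.replace("foo", "bar"))

result_imgs = lxml.html.tostring(html, encoding="unicode")
print(result_imgs)
```

Unlike the iterlinks() version, this leaves the script element's src untouched, since only img elements are selected.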