I am trying to write a script to rewrite links in a Plone ATDocument. When I call getText() and dereference all the UID links by calling portal_transforms.convertTo('text/x-html-safe') the URLs are all rewritten as "http://foo/Plone/..." (literally, "foo", as the domain name). When I save the text with setText() and try to view it in the site the "foo" domain name is still there and is not re-written to the correct domain.
How can I make the HTML passed to setText() understand links to the current site?
If you want to modify the value of the text field you need to get the value by using the raw getter of the field.
>>> item.getRawText()
>>> ...
This returns the stored value untouched; you can then modify the text and save it.
NOTE:
By default, Plone uses the UID to handle internal links (check the linkintegrity feature), so you will probably not get a relative path from the raw getter, but a URL like ../resolveuid/$(UID).
EDIT:
This may help you to rewrite the links.
>>> import re
>>> from lxml import html
>>> resolveuid_re = re.compile('^[./]*resolve[Uu]id/([^/]*)/?(.*)$') # Regex resolving the uid from a path.
Get all links from text
>>> raw_text = obj.getRawText()
>>> dom = html.fromstring(raw_text)
>>> links = dom.xpath('//a/@href')
>>> links
['resolveuid/fbb9304e48b24a30ac7ba31eb5be2cb6']
Get uid(s)
>>> uid = resolveuid_re.match(links[0]).group(1)
>>> uid
'fbb9304e48b24a30ac7ba31eb5be2cb6'
Now you can find and replace the UID(s) in the raw text, save the result with setText(), and you're done.
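The find-and-replace step could look roughly like the sketch below. It is only an illustration, not Plone API: replace_uids and uid_map are hypothetical names, the pattern is a variant of the regex above adapted to match inside href attributes, and it uses only the standard re module so it can be tried outside a Plone instance.

```python
import re

# Matches an href attribute that points at a resolveuid link, for example
# href="../resolveuid/<uid>/optional/subpath". Groups: the attribute
# opener, the resolveuid path prefix, and the UID itself.
HREF_UID_RE = re.compile(
    r'(href\s*=\s*["\'])([./]*resolve[Uu]id/)([^/"\'#?]+)'
)


def replace_uids(raw_text, uid_map):
    """Return raw_text with each resolveuid UID swapped according to uid_map.

    uid_map is a hypothetical dict of {old_uid: new_uid}. UIDs that are
    not in the mapping, and links that are not resolveuid links, are left
    untouched.
    """
    def _swap(match):
        prefix, path, uid = match.groups()
        return prefix + path + uid_map.get(uid, uid)

    return HREF_UID_RE.sub(_swap, raw_text)


# Assumed usage inside Plone (not runnable outside a site):
#   new_text = replace_uids(obj.getRawText(), {'old-uid': 'new-uid'})
#   obj.setText(new_text)
```

Because the text is written back with the resolveuid links intact, Plone can still resolve them to the correct site URL when the document is rendered, which avoids the hard-coded domain from the question.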