I'm trying to fill out a form containing a textarea element. I'm using Python with the BeautifulSoup and mechanize modules (I'm stuck on Python 2.6.5 on FreeBSD 8.1 with the latest modules in the FreeBSD repository: BeautifulSoup 3.1.0.1 and mechanize 0.2.1).
The problem with BeautifulSoup is that it doesn't properly set the textarea contents. I can try soup.textarea.insert(0, "FOO") or even soup.textarea.contents = "FOO", but once I check the current value with soup.textarea, I still see the old HTML tags with no content between them:
<textarea name="classified_description" class="classified_textarea_text"></textarea>
The problem with mechanize is it only seems to operate on true forms. Per the HTML I'm parsing below, this is not really a form, but rather a set of divs with input items inside.
How can I use Python or either of these modules to set the value of this textarea element?
<div class="classified_field">
<div class="classified_input_label">Description</div>
<div class="classified_textarea_div">
<textarea name="classified_description" id="classified_description" class="classified_textarea_text"></textarea>
</div>
<div class="site_clear"></div>
</div>
I tried Vladimir's technique below, and while it works with his example, it does not work in my production code for some reason. I'm able to use .find() to get the textarea, but the .insert() is giving me grief. Here's what I have so far:
>>> soup.find('textarea', {'name': 'classified_description'})
<textarea name="classified_description" class="classified_textarea_text"></textarea>
>>> soup.find('textarea', {'name': 'classified_description'}).insert(0, "some text here")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/site-packages/BeautifulSoup.py", line 233, in insert
    newChild.nextSibling.previousSibling = newChild
AttributeError: 'unicode' object has no attribute 'previousSibling'
>>>
Anyone know why this would throw the unicode error? Clearly my soup object is not just a unicode string, because I can successfully use .find on it.
SOLUTION:
Vladimir's solution is correct, but it's possible for real-world HTML to generate a malformed start tag error in BeautifulSoup 3.1 (official reason here). After downgrading to BeautifulSoup 3.0.8, everything worked fine. When I posted the initial question, I had to do some jury-rigging to get the output of mechanize's read() into the BeautifulSoup object without triggering the malformed start tag error. This caused a unicode string to be created instead of a BeautifulSoup object. Correcting my mechanize code to work with the older BeautifulSoup produced the desired behavior.
Here is an example using BeautifulSoup:
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('<textarea name="classified_description"></textarea>')
soup.find('textarea', {'name': 'classified_description'}).insert(0, 'value')
assert str(soup) == '<textarea name="classified_description">value</textarea>'
The BeautifulSoup documentation on modifying the parse tree describes such transformations in detail.
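If neither module is usable on a given page, the question's "plain Python" option could be sketched with the standard library alone. The following is a minimal sketch, not the technique from the answer above: it assumes Python 3 (not the 2.6 setup in the question), and the names TextareaLocator, _to_offset and set_textarea are hypothetical helpers invented for this illustration. It uses html.parser only to locate the named textarea, then splices the new text into the original string, leaving the rest of the markup untouched.

```python
from html import escape
from html.parser import HTMLParser


class TextareaLocator(HTMLParser):
    """Record where the content of the named <textarea> starts and ends."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.open_tag = None   # ((line, column), raw start-tag text)
        self.close_pos = None  # (line, column) of the matching </textarea>

    def handle_starttag(self, tag, attrs):
        if (tag == "textarea" and self.open_tag is None
                and dict(attrs).get("name") == self.name):
            self.open_tag = (self.getpos(), self.get_starttag_text())

    def handle_endtag(self, tag):
        if (tag == "textarea" and self.open_tag is not None
                and self.close_pos is None):
            self.close_pos = self.getpos()


def _to_offset(html, pos):
    """Convert the parser's (1-based line, 0-based column) to a string index."""
    line, column = pos
    lines = html.split("\n")
    return sum(len(text) + 1 for text in lines[:line - 1]) + column


def set_textarea(html, name, value):
    """Return html with the named textarea's content replaced by value."""
    locator = TextareaLocator(name)
    locator.feed(html)
    locator.close()
    if locator.open_tag is None or locator.close_pos is None:
        raise ValueError("no complete textarea named %r" % name)
    start_pos, start_text = locator.open_tag
    content_start = _to_offset(html, start_pos) + len(start_text)
    content_end = _to_offset(html, locator.close_pos)
    # Escape &, < and > so the value cannot break out of the element.
    return html[:content_start] + escape(value, quote=False) + html[content_end:]


page = (
    '<div class="classified_textarea_div">\n'
    '<textarea name="classified_description" id="classified_description" '
    'class="classified_textarea_text"></textarea>\n'
    '</div>'
)
filled = set_textarea(page, "classified_description", "some text here")
```

Because only the span between the opening and closing tags is rewritten, attribute order and whitespace elsewhere in the page are preserved. Whether this parser copes with the specific markup that triggered the malformed start tag error has not been checked here.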