I'm using mechanize's .get_data() method, and printing its result shows the HTML that I want. I also checked the type of the result, and it is 'str'.
But when I try to parse that str with BeautifulSoup, I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-163-11c061bf6c04> in <module>()
7 html = get_html(first[i],last[i])
8 print type(html)
----> 9 print parse_page(html)
10 # l_to_store.append(parse_page(html))
11 # hfb_data['l_to_store']=l_to_store
<ipython-input-161-bedc1ba19b10> in parse_hfb_page(html)
3 parse html to extract info in connection with a particular person
4 '''
----> 5 soup = BeautifulSoup(html)
6 for el in soup.find_all('li'):
7 if el.find('span').contents[0]=='Item:':
TypeError: 'module' object is not callable
What exactly is 'module', and how do I get what get_data() returns into html?
When you import BeautifulSoup like this:
import BeautifulSoup
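you are importing the module, which contains classes, functions, and so on. The name BeautifulSoup then refers to the module itself, which is why calling it raises "'module' object is not callable". To create an instance of the BeautifulSoup class from the BeautifulSoup module, you need to either import the class directly or use its full name including the module prefix, as yonili suggests in the comment above: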
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)
or
import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup(html)
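The same error can be reproduced without BeautifulSoup installed. The sketch below is only an illustration: it uses the standard-library datetime module as a stand-in (an assumption made for the example, not part of the original question), because it also contains a class with the same name as the module. The variable names are hypothetical.

```python
import types

import datetime  # binds the name `datetime` to the module, not to the class

# The name refers to a module object, so it cannot be called.
name_is_module = isinstance(datetime, types.ModuleType)

try:
    datetime(2020, 1, 1)  # same mistake as BeautifulSoup(html) after `import BeautifulSoup`
except TypeError as exc:
    error_message = str(exc)  # contains "'module' object is not callable"

# Fix 1: keep the plain import and use the module prefix.
d1 = datetime.datetime(2020, 1, 1)

# Fix 2: import the class itself (aliased here only to avoid
# rebinding the module name used above).
from datetime import datetime as DateTimeClass
d2 = DateTimeClass(2020, 1, 1)
```

If you are unsure which of the two you have, printing type(BeautifulSoup) in your session should show whether the name is bound to a module or to a class.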