
BeautifulSoup can't find a tag by its class

Here is the relevant part of the web page:

 <div class="MPinfo">
     <ul class="frontList">
         <li>some text</li>
         <li>some text</li>
         <li>some text</li>
         <li>some text</li>
         <li>some text</li>
         <li>some text
             <a href="/some_local_link/8976">some text</a>;
             <a href="/some_local_link/8943">some text</a>;
         </li>
         <li>E-mail: 
             <a href="mailto:[email protected]">[email protected]</a>
         </li>
     </ul>
 </div>

I am trying to get the div by its class and then extract just the email address from the mailto link, e.g. [email protected]

page = urllib.urlopen(link)
soup = BeautifulSoup(page.read())
print soup.find('div', attrs={'class': 'MPinfo'})

I have tried several ways to get the div, but every attempt returns either an empty list or None.

Victor Nikolov asked Nov 09 '22 17:11



1 Answer

You can select all the li elements under the div with find_all. It returns a list, so you can pick the last li with the index [-1] and then read the text of the a tag inside it:

>>> soup.find("div",attrs={"class":"MPinfo"}).find_all("li")[-1].a.text
'[email protected]'
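
If the position of the e-mail item in the list is not guaranteed, a variant of the same idea is to look for the mailto: link directly instead of relying on [-1]. The following is only a minimal sketch under some assumptions: it uses the bs4 package on Python 3 with the built-in "html.parser", the markup is a shortened copy of the snippet from the question, and the address someone@example.com and the helper name extract_email are placeholders made up for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question.
# The address is a placeholder, not the real one from the page.
HTML = """
<div class="MPinfo">
    <ul class="frontList">
        <li>some text</li>
        <li>some text
            <a href="/some_local_link/8976">some text</a>;
        </li>
        <li>E-mail:
            <a href="mailto:someone@example.com">someone@example.com</a>
        </li>
    </ul>
</div>
"""


def extract_email(html):
    """Return the address of the first mailto: link inside div.MPinfo, or None."""
    soup = BeautifulSoup(html, "html.parser")

    # find() returns None when no matching div exists, so check before chaining.
    div = soup.find("div", class_="MPinfo")
    if div is None:
        return None

    # Match the first <a> whose href starts with "mailto:".
    # The filter receives None for <a> tags without an href.
    link = div.find("a", href=lambda h: h is not None and h.startswith("mailto:"))
    if link is None:
        return None

    # Drop the "mailto:" scheme and any "?subject=..." part.
    return link["href"][len("mailto:"):].split("?", 1)[0]


print(extract_email(HTML))
```

Checking for None at each step also makes it easier to see which lookup fails. If the div lookup already returns None on the real page, the div is probably not in the HTML that was downloaded, and the selector itself is not the problem.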
helloiamsinan answered Nov 14 '22 21:11
