
Getting the href of <a> tag which is in <li>

How can I get the href of every <a> tag inside the elements with class "subforum" in the code below?

<li class="subforum">
<a href="Link1">Link1 Text</a>
</li>
<li class="subforum">
<a href="Link2">Link2 Text</a>
</li>
<li class="subforum">
<a href="Link3">Link3 Text</a>
</li>

I have tried this code but obviously it didn't work.

Bs = BeautifulSoup(requests.get(url).text,"lxml")
Class = Bs.findAll('li', {'class': 'subforum"'})
for Sub in Class:
    print(Link.get('href'))
Hashik asked Dec 04 '22 23:12


1 Answer

The href attribute belongs to the <a> tag, not the <li> tag. Use li.a to get the first <a> tag inside each <li>.

Documentation: Navigating using tag names

import bs4

html = '''<li class="subforum">
 <a href="Link1">Link1 Text</a>
 </li>
 <li class="subforum">
<a href="Link2">Link2 Text</a>
</li>
<li class="subforum">
<a href="Link3">Link3 Text</a>
</li>'''

soup = bs4.BeautifulSoup(html, 'lxml')
for li in soup.find_all(class_="subforum"):
    print(li.a.get('href'))

out:

Link1
Link2
Link3
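
Note that li.a returns only the first <a> inside each <li>, and it is None when there is no <a> at all. If the real page may contain items with several links or with none, a minimal sketch along these lines may help. The markup below is hypothetical (it is not from the question), and html.parser is used here only so the example runs without lxml installed.

```python
import bs4

# Hypothetical markup: one item has two links, one has none.
html = '''<li class="subforum"><a href="Link1">Link1 Text</a><a href="Link1b">More</a></li>
<li class="subforum">No link here</li>
<li class="subforum"><a href="Link3">Link3 Text</a></li>'''

soup = bs4.BeautifulSoup(html, "html.parser")

hrefs = []
for li in soup.find_all("li", class_="subforum"):
    # href=True matches only <a> tags that actually carry an href attribute,
    # so items without a link are skipped instead of raising an error.
    for a in li.find_all("a", href=True):
        hrefs.append(a["href"])

print(hrefs)  # ['Link1', 'Link1b', 'Link3']
```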

Why use class_:

It’s very useful to search for a tag that has a certain CSS class, but the name of the CSS attribute, class, is a reserved word in Python. Using class as a keyword argument will give you a syntax error. As of Beautiful Soup 4.1.2, you can search by CSS class using the keyword argument class_.
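
As a rough illustration, the class_ keyword is one of several ways to express the same search. The sketch below compares it with an attribute dictionary (the form tried in the question, minus the stray quote in 'subforum"') and with a CSS selector. The variable names are made up for this example, and html.parser is assumed instead of lxml so it runs without extra dependencies.

```python
import bs4

html = '''<li class="subforum"><a href="Link1">Link1 Text</a></li>
<li class="subforum"><a href="Link2">Link2 Text</a></li>
<li class="subforum"><a href="Link3">Link3 Text</a></li>'''

soup = bs4.BeautifulSoup(html, "html.parser")

# 1. Keyword argument class_ (the trailing underscore avoids the reserved word).
by_keyword = [li.a.get("href") for li in soup.find_all("li", class_="subforum")]

# 2. Attribute dictionary, where "class" is an ordinary string key.
by_attrs = [li.a.get("href") for li in soup.find_all("li", attrs={"class": "subforum"})]

# 3. CSS selector that goes straight to the <a> children.
by_selector = [a.get("href") for a in soup.select("li.subforum > a")]

print(by_keyword)  # ['Link1', 'Link2', 'Link3']
```

All three lists should be identical for this input; which form to use is mostly a matter of taste.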

宏杰李 answered Dec 11 '22 17:12