BeautifulSoup, findAll after findAll?

Question

I'm pretty new to Python and mainly need it for getting information from websites. Here I tried to get the short headlines from the bottom of the website, but I can't quite get them.

from bfs4 import BeautifulSoup
import requests

url = "http://some-website"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")

nachrichten = soup.findAll('ul', {'class':'list'})

Now I need another findAll to get all the links (the a tags) from the variable "nachrichten", but how can I do this?

Padraic Cunningham · Accepted Answer

Use a CSS selector with select if you want all the links in a single list:

anchors = soup.select('ul.list a')

If you want individual lists:

anchors = [ul.find_all('a') for ul in soup.find_all('ul', {'class': 'list'})]

Also if you want the hrefs you can make sure you only find the anchors with href attributes and extract:

hrefs = [a["href"] for a in soup.select('ul.list a[href]')]

With find_all, set href=True, i.e. ul.find_all('a', href=True).
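
To make the difference between the three variants concrete, here is a minimal sketch that runs them against a small inline snippet instead of the real page. The HTML string and its hrefs are made up for illustration; only the `ul` elements with class `list` mirror the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption,
# not the actual site): two <ul class="list"> blocks and one other <ul>.
HTML = """
<ul class="list">
  <li><a href="/a1">First</a></li>
  <li><a href="/a2">Second</a></li>
</ul>
<ul class="list">
  <li><a href="/b1">Third</a></li>
  <li><a name="no-href">Anchor without href</a></li>
</ul>
<ul class="other"><li><a href="/x">Ignored</a></li></ul>
"""

soup = BeautifulSoup(HTML, "html.parser")

# One flat list with every <a> inside any <ul class="list">.
flat = soup.select("ul.list a")

# One list of anchors per <ul class="list">.
per_list = [ul.find_all("a") for ul in soup.find_all("ul", {"class": "list"})]

# Only anchors that carry an href, reduced to the attribute value.
hrefs = [a["href"] for a in soup.select("ul.list a[href]")]

# The same filter written with find_all and href=True.
hrefs_find_all = [
    a["href"]
    for ul in soup.find_all("ul", {"class": "list"})
    for a in ul.find_all("a", href=True)
]

print(len(flat))                            # 4
print([len(group) for group in per_list])   # [2, 2]
print(hrefs)                                # ['/a1', '/a2', '/b1']
```

The anchor without an href shows up in `flat` and `per_list` but not in `hrefs`, which is why the `a[href]` selector (or `href=True`) is worth adding before indexing with `a["href"]`.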

Sandeep · Answer

from bs4 import BeautifulSoup
import requests
url = "http://www.n-tv.de/ticker/"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
nachrichten = soup.findAll('ul', {'class':'list'})
links = []
for ul in nachrichten:
    links.extend(ul.findAll('a'))
print(len(links))

Hope this solves your problem. I think the import should be bs4; I have never heard of bfs4.
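
As a rough sketch of why the loop is needed at all: the first find_all returns a list-like ResultSet, not a single tag, so a second find_all cannot be called on it directly. The snippet below uses made-up inline HTML (an assumption, not the real ticker page) so it runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
HTML = """
<ul class="list">
  <li><a href="/one">One</a></li>
  <li><a href="/two">Two</a></li>
</ul>
<ul class="list">
  <li><a href="/three">Three</a></li>
</ul>
"""

soup = BeautifulSoup(HTML, "html.parser")
nachrichten = soup.find_all("ul", {"class": "list"})

# Calling find_all on the ResultSet itself raises AttributeError,
# because it is a list of tags rather than a tag.
try:
    nachrichten.find_all("a")
    chained_ok = True
except AttributeError:
    chained_ok = False

# Searching each <ul> separately and flattening the results works.
links = []
for ul in nachrichten:
    links.extend(ul.find_all("a"))

print(chained_ok)   # False
print(len(links))   # 3
```

find_all is the current spelling in bs4; findAll is the older name from BeautifulSoup 3.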

Tags: python, beautifulsoup, python-requests

MusicPlay3r
