I'm trying to get a short, basic unordered list <ul> off of Wikipedia. My end goal is to put it into a DataFrame.
My question is, where do I go from here?
In [28]: from bs4 import BeautifulSoup
import requests
from pandas import Series, DataFrame
In [29]: url = "https://en.wikipedia.org/wiki/National_Pro_Grid_League"
In [31]: result = requests.get(url)
In [32]: c = result.content
In [33]: soup = BeautifulSoup(c, "html.parser")
I can't seem to find any answers to this on Stack Overflow, so I would appreciate any advice anyone could give me.
This is the specific list I'm looking for:
Active teams
Baltimore Anthem (2015–present)
Boston Iron (2014–present)
DC Brawlers (2014–present)
Los Angeles Reign (2014–present)
Miami Surge (2014–present)
New York Rhinos (2014–present)
Phoenix Rise (2014–present)
San Francisco Fire (2014–present)
First you'll want to find the correct part of the page. You can do this by finding the heading with id="Active_teams_at_league_closing" and then finding the next <ul> element from there.
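from bs4 import BeautifulSoup
import requests

url = "https://en.wikipedia.org/wiki/National_Pro_Grid_League"
r = requests.get(url)
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(r.content, "html.parser")
heading = soup.find(id='Active_teams_at_league_closing')
teams = heading.find_next('ul')
# Loop over the <li> items only; iterating the <ul> directly also yields whitespace text nodes
for team in teams.find_all('li'):
    # get_text() returns the full text even when the item contains nested tags such as links
    print(team.get_text())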
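Since the end goal is a DataFrame, here is a minimal sketch of one way to get there. The function name teams_to_dataframe, the column names, and the SAMPLE_HTML snippet are all made up for illustration; the sample markup only approximates what the page looks like, and the regular expression assumes each item reads like "Team name (2014–present)".

```python
import re

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the relevant part of the page, so the
# example runs without a network request.
SAMPLE_HTML = """
<h3><span id="Active_teams_at_league_closing">Active teams</span></h3>
<ul>
<li><a href="#">Baltimore Anthem</a> (2015–present)</li>
<li><a href="#">Boston Iron</a> (2014–present)</li>
</ul>
"""


def teams_to_dataframe(html, heading_id="Active_teams_at_league_closing"):
    """Build a DataFrame from the first <ul> after the given heading id."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find(id=heading_id)
    if heading is None:
        raise ValueError("no element with id %r" % heading_id)
    team_list = heading.find_next("ul")
    if team_list is None:
        raise ValueError("no <ul> found after the heading")

    rows = []
    for li in team_list.find_all("li"):
        text = li.get_text().strip()
        # Assumed item format: "Name (YYYY...)". Fall back to the raw text.
        match = re.match(r"(.+?)\s*\((\d{4})", text)
        if match:
            rows.append({"team": match.group(1),
                         "first_season": int(match.group(2))})
        else:
            rows.append({"team": text, "first_season": None})
    return pd.DataFrame(rows, columns=["team", "first_season"])


df = teams_to_dataframe(SAMPLE_HTML)
print(df)
```

To run it against the live page, pass r.content from the snippet above instead of SAMPLE_HTML. If the heading id on the page changes, the function raises a ValueError rather than failing later with an AttributeError on None.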