I'm new to BeautifulSoup in Python and I'm trying to extract a dict from a BeautifulSoup object.
I've used BeautifulSoup to load some JSON and got a BeautifulSoup.BeautifulSoup object, soup.
I'm trying to get values out of soup, but when I do result = soup.findAll("bill") I get an empty list, []. How can I extract the data from soup to get this dict as the result:
{u'congress': 113,
u'number': 325,
u'title': u'A bill to ensure the complete and timely payment of the obligations of the United States Government until May 19, 2013, and for other purposes.',
u'type': u'hr'}
print type(soup)
print soup
=> result below
BeautifulSoup.BeautifulSoup
{
"bill": {
"congress": 113,
"number": 325,
"title": "A bill to ensure the complete and timely payment of the obligations of the United States Government until May 19, 2013, and for other purposes.",
"type": "hr"
},
"category": "passage",
"chamber": "s"
}
UPDATE
Here is how I got soup:
from BeautifulSoup import BeautifulSoup
import urllib2
url = urllib2.urlopen("https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json")
content = url.read()
soup = BeautifulSoup(content)
To convert a JSON string to a Python dictionary, use the loads() function of the json module, passing the string as the argument. If the top-level JSON value is an object, json.loads() returns a new Python dictionary holding its key-value pairs. If the top-level value is an array of objects, json.loads() returns a Python list instead.
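As a small illustration of the object-versus-array behaviour of json.loads(), here is a minimal sketch. It assumes Python 3; the object string is trimmed from the question's data (title omitted), the array string is made up purely for illustration, and all variable names are hypothetical.

```python
import json

# JSON object modelled on the question's data (title omitted for brevity).
object_text = (
    '{"bill": {"congress": 113, "number": 325, "type": "hr"}, '
    '"category": "passage", "chamber": "s"}'
)
# Made-up JSON array, used only to show the list case.
array_text = '[{"chamber": "s"}, {"chamber": "h"}]'

parsed_object = json.loads(object_text)  # top-level JSON object -> dict
parsed_array = json.loads(array_text)    # top-level JSON array  -> list

# Nested JSON objects become nested dicts, so "bill" is a plain key lookup.
bill = parsed_object["bill"]

print(type(parsed_object).__name__, type(parsed_array).__name__)
print(bill["congress"], bill["number"], bill["type"])
```

Once the text is parsed this way, no tag search is needed: the "bill" entry is reached by ordinary dictionary indexing.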
Though Python's BeautifulSoup module was designed to scrape HTML files, it can also be used to parse XML files. In today's professional marketplace, it is useful to be able to change an XML file into other formats, specifically dictionaries, CSV, JSON, and dataframes according to specific needs.
I'm not very familiar with BeautifulSoup, but if you just need to decode the JSON:
import json
newDictionary = json.loads(str(soup))
You could remove BeautifulSoup entirely:
import json
import urllib2
url = "https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json"
data = json.load(urllib2.urlopen(url))
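For readers on Python 3, where urllib2 no longer exists (its functionality moved to urllib.request), the same idea might look roughly like the sketch below. The helper names extract_bill and fetch_bill are hypothetical, and the sketch assumes the URL still serves the JSON shown in the question. It also suggests why soup.findAll("bill") came back empty: BeautifulSoup searches for markup tags, and a JSON document contains none.

```python
import json
from urllib.request import urlopen

# URL taken from the question; whether it is still live is an assumption.
URL = "https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json"


def extract_bill(raw_bytes):
    """Decode raw JSON bytes and return the nested "bill" dict."""
    data = json.loads(raw_bytes.decode("utf-8"))
    return data["bill"]


def fetch_bill(url=URL):
    """Download the JSON document and return its "bill" dict (needs network)."""
    with urlopen(url) as response:
        return extract_bill(response.read())


# Example call, left commented out because it needs network access:
# bill = fetch_bill()
```

Splitting the parsing from the download keeps extract_bill usable on any bytes you already have, for example the content variable from the question.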