I imagine this must have a simple answer, but I am struggling: I want to take a url (which outputs json) and get the data in a usable dictionary in python. I am stuck on the last step.
>>> import urllib2 >>> import simplejson >>> req = urllib2.Request("http://vimeo.com/api/v2/video/38356.json", None, {'user-agent':'syncstream/vimeo'}) >>> opener = urllib2.build_opener() >>> f = opener.open(req) >>> f.read() # this works '[{"id":"38356","title":"Forgetfulness - Billy Collins Animated Poetry","description":"US Poet Laureate Billy Collins reads his poem ","url":"http:\\/\\/vimeo.com\\/38356","upload_date":"2006-01-24 15:21:03","thumbnail_small":"http:\\/\\/80.media.vimeo.com\\/d1\\/5\\/47\\/74\\/thumbnail-4774968.jpg","thumbnail_medium":"http:\\/\\/80.media.vimeo.com\\/d1\\/5\\/46\\/85\\/thumbnail-4685118.jpg","thumbnail_large":"http:\\/\\/images.vimeo.com\\/87\\/39\\/873998\\/873998_640x480.jpg","user_name":"smjwt","user_url":"http:\\/\\/vimeo.com\\/smjwt","user_portrait_small":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.30.jpg","user_portrait_medium":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.75.jpg","user_portrait_large":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.100.jpg","user_portrait_huge":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.300.jpg","stats_number_of_likes":"281","stats_number_of_plays":"9173","stats_number_of_comments":23,"duration":"112","width":"320","height":"240","tags":"poetry, poet, online poetry, famous poet, video poetry, modern poetry, famous poem, poetry sites, poetry websites, audio poetry, american poet, animation clips, american poetry, free poetry sites, animation art, free poetry, animated clips, poem, poet laureate"}]' >>> simplejson.load(f) Traceback (most recent call last): File "<console>", line 1, in <module> File "/usr/lib/python2.5/site-packages/django/utils/simplejson/__init__.py", line 298, in load parse_constant=parse_constant, **kw) File "/usr/lib/python2.5/site-packages/django/utils/simplejson/__init__.py", line 338, in loads return _default_decoder.decode(s) File "/usr/lib/python2.5/site-packages/django/utils/simplejson/decoder.py", line 326, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib/python2.5/site-packages/django/utils/simplejson/decoder.py", line 344, in raw_decode raise ValueError("No JSON object could be decoded") ValueError: No JSON object could be decoded
Any ideas where I am going wrong?
The first step in this process is to choose a web scraper for your project. We obviously recommend ParseHub. Not only is it free to use, but it also works with all kinds of websites. With ParseHub, web scraping is as simple as clicking on the data you want and downloading it as an excel sheet or JSON file.
Try
f = opener.open(req) simplejson.load(f)
without running f.read() first. When you run f.read(), the filehandle's contents are slurped so there is nothing left when your call simplejson.load(f)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With