I'm just starting out in Python and I'm trying to fetch the HTML source of a site using urllib2. However, the HTML I get back is incomplete: some tags are missing. I know they're missing because the markup shows up when I inspect the site in Firebug. Is this due to the way I'm requesting the data, or due to the site? Either way, is there a way to get the full source of the page in Python and then parse it?
This is the code I'm currently using, along with the site I'm requesting:
import urllib2
url = 'http://marinetraffic.com/ais/'
response = urllib2.urlopen(url)
html = response.read()
print(html)
Specifically, the content inside the <div id="map_area"> element is missing. Any help or pointers greatly appreciated!
You are getting incomplete data because most of the content on this page is generated dynamically by JavaScript...
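One way to confirm this is to check what the downloaded HTML actually contains inside the div. urllib2 only returns the markup the server sends and never runs scripts, while Firebug shows the page after the scripts have filled it in. The following is a minimal sketch, not a definitive implementation: the class DivContentChecker, the helper describe_div, and both sample HTML strings are hypothetical and made up for illustration. They are not the real markup of the site.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2, as used with urllib2


class DivContentChecker(HTMLParser):
    """Records the tags and text found inside <div id="target_id">."""

    def __init__(self, target_id):
        HTMLParser.__init__(self)
        self.target_id = target_id
        self.depth = 0       # > 0 while we are inside the target div
        self.found = False   # True once the target div has been seen
        self.children = []   # tag names and non-blank text inside the div

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.children.append(tag)
            if tag == 'div':
                self.depth += 1  # nested div: track it so we close correctly
        elif tag == 'div' and dict(attrs).get('id') == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == 'div':
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.children.append(data.strip())


def describe_div(html, target_id):
    """Return (div_was_found, list_of_content_inside_it)."""
    checker = DivContentChecker(target_id)
    checker.feed(html)
    checker.close()
    return checker.found, checker.children


# Hypothetical markup: what a server might send before any script runs.
static_html = ('<html><body><div id="map_area"></div>'
               '<script src="map.js"></script></body></html>')
# Hypothetical markup: what a browser might show after scripts have run.
rendered_html = ('<html><body><div id="map_area"><img src="tile.png">'
                 '<span>Vessel</span></div></body></html>')

print(describe_div(static_html, 'map_area'))    # (True, [])
print(describe_div(rendered_html, 'map_area'))  # (True, ['img', 'span', 'Vessel'])
```

To try it on the real page, pass the html string from the question's code to describe_div(html, 'map_area'). On Python 3, response.read() returns bytes, so decode it first. If the div is found but its content list is empty, the missing markup is being added in the browser and is not part of the server's response.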