Python urllib2 Response header

I'm trying to extract the response headers of a URL request. When I use Firebug to inspect the response for the URL, it shows:

Content-Type text/html 

However, when I use the Python code:

urllib2.urlopen(URL).info() 

the output is:

Content-Type: video/x-flv 

I am new to Python, and to web programming in general; any helpful insight is much appreciated. Also, if more info is needed, please let me know.

Thanks in advance for reading this post.

asked Oct 31 '09 06:10 by looter


1 Answer

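Try sending the request the way Firefox does. The server may be returning a different response depending on request headers such as User-Agent or Referer. You can see the request headers in Firebug, so add them to your request object: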

import urllib2

request = urllib2.Request('http://your.tld/...')
request.add_header('User-Agent', 'some fake agent string')
request.add_header('Referer', 'fake referrer')
# ...
response = urllib2.urlopen(request)
# check content type:
print response.info().getheader('Content-Type')
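For readers on Python 3, where urllib2 no longer exists and its functionality lives in urllib.request, the same idea might look roughly like the sketch below. The helper names build_browser_like_request and fetch_content_type are hypothetical, and the header values are placeholders you would replace with what Firebug shows for your own request.

```python
import urllib.request


def build_browser_like_request(url, user_agent, referer=None):
    """Return a Request carrying browser-style headers (hypothetical helper)."""
    request = urllib.request.Request(url)
    # Header values are assumptions; copy the real ones from your browser.
    request.add_header("User-Agent", user_agent)
    if referer is not None:
        request.add_header("Referer", referer)
    return request


def fetch_content_type(request):
    """Send the request and return the Content-Type response header."""
    with urllib.request.urlopen(request) as response:
        # response.headers behaves like a case-insensitive mapping
        return response.headers.get("Content-Type")


# Example usage (performs a real network request, so it is left commented out):
# req = build_browser_like_request("http://your.tld/", "some fake agent string")
# print(fetch_content_type(req))
```

Comparing the value returned with and without the extra headers is a quick way to check whether the server varies its response by client.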

There's also HTTPCookieProcessor, which stores and resends cookies automatically and can make the request look even more like a browser's, but I don't think you'll need it in most cases. Have a look at Python's documentation:

http://docs.python.org/library/urllib2.html
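If cookies do turn out to matter, a minimal Python 3 sketch of the HTTPCookieProcessor approach could look like this. It assumes the Python 3 module layout (urllib.request and http.cookiejar rather than urllib2 and cookielib), and build_cookie_opener is a hypothetical helper name, not part of the standard library.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(user_agent):
    """Return (opener, cookie_jar); the opener keeps cookies across requests."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # addheaders replaces the default "Python-urllib/x.y" User-Agent.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cookie_jar


# Example usage (performs a real network request, so it is left commented out):
# opener, jar = build_cookie_opener("some fake agent string")
# response = opener.open("http://your.tld/")
# print(response.headers.get("Content-Type"))
```

Any cookies the server sets on the first response are kept in the jar and sent back on later calls to opener.open, which is what a browser session would do.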

answered Sep 18 '22 14:09 by qingbo