
Extracting Information from a Tuple (Python)

I'm currently using the httplib library in Python 2.7 to fetch some headers from a website in order to establish a) the file size of a download and b) the last-modified date of the file. I've checked with some online tools, and both headers are present.

My Python script appears to work correctly and brings back the required information. However, the response containing the header information is a list of tuples. A sample of the response is below:

[('content-length', '2501479'),
 ('accept-ranges', 'bytes'),
 ('vary', 'Accept-Encoding'),
 ('server', 'off'),
 ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
 ('etag', '"2c8171a-262b67-4afb368edfffc"'),
 ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
 ('content-type', 'text/plain')]

What I want is to extract just the file size ("2501479") and the last-modified date ("Thu, 20 Oct 2011 04:30:01 GMT"). Any ideas how I can go about this? I originally tried variable[0], but that returns the whole first tuple, ('content-length', '2501479'). How can I return only the file size (that is, the second element of the first tuple in the list)?

thefragileomen asked Oct 20 '11 16:10


2 Answers

First, convert the tuples into a dict, and then convert the value to int to get a number:

response_tuples = [('content-length', '2501479'), ('accept-ranges', 'bytes')]
response = dict(response_tuples)  # header name -> header value
try:
    content_length = int(response['content-length'])
except KeyError:
    raise  # Handle missing content-length here
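
As a minimal sketch of what the "handle missing content-length" branch might look like, assuming you would rather get None back than an exception, dict.get can supply the fallback. The helper name get_content_length is hypothetical and not part of httplib:

```python
def get_content_length(header_pairs):
    """Return the content-length as an int, or None if missing or malformed.

    header_pairs is a list of (name, value) tuples like the one in the question.
    get_content_length is a hypothetical helper name used for illustration.
    """
    headers = dict(header_pairs)
    raw = headers.get('content-length')  # None when the header is absent
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # The server sent something that is not a plain integer
        return None


sample = [('content-length', '2501479'), ('accept-ranges', 'bytes')]
print(get_content_length(sample))
```

Returning None keeps the caller's check to a single "is None" test; re-raising, as in the snippet above, is the better choice if a missing size should stop the script.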
phihag answered Oct 05 '22 06:10


First, you can make it a little easier to work with by turning your list of tuples into a dictionary:

>>> headers = [('content-length', '2501479'),
...  ('accept-ranges', 'bytes'),
...  ('vary', 'Accept-Encoding'),
...  ('server', 'off'),
...  ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
...  ('etag', '"2c8171a-262b67-4afb368edfffc"'),
...  ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
...  ('content-type', 'text/plain')]
>>> 
>>> headers = dict(headers)
>>> int(headers['content-length'])
2501479
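
For reference, the direct indexing the question was reaching for also works on the original list of tuples, though it depends on the server returning headers in a fixed order, which is why a lookup by name is usually safer. A short sketch, using a hypothetical variable name header_list for the list before it is converted to a dict:

```python
header_list = [('content-length', '2501479'),
               ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT')]

# header_list[0] is the whole tuple; a second index picks the value out of it
first_value = header_list[0][1]

# Tuple unpacking in a loop avoids relying on position in the list
wanted = {}
for name, value in header_list:
    if name in ('content-length', 'last-modified'):
        wanted[name] = value

print(first_value)
print(wanted['last-modified'])
```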

For the date, I would parse it with the email.utils.parsedate function, which returns a 9-item time tuple rather than a datetime object (the same call works for headers['last-modified']):

>>> import email.utils
>>> email.utils.parsedate(headers['date'])
(2011, 10, 20, 16, 1, 11, 0, 1, -1)
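
If an actual datetime object is needed, the first six fields of that tuple can be passed to the datetime constructor. A minimal sketch, assuming the header is in the format shown in the question; parse_http_date is a hypothetical helper name:

```python
import datetime
import email.utils


def parse_http_date(value):
    """Convert an HTTP date header string into a naive datetime, or None."""
    parsed = email.utils.parsedate(value)  # 9-tuple, or None if unparseable
    if parsed is None:
        return None
    # Only year..second are meaningful; the last three fields are placeholders
    return datetime.datetime(*parsed[:6])


last_modified = parse_http_date('Thu, 20 Oct 2011 04:30:01 GMT')
print(last_modified)
```

The result carries no timezone information; since the header is given in GMT, it should be treated as a UTC time.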
jterrace answered Oct 05 '22 07:10