How to get HTTP headers before downloading with Ruby's OpenUri

I am currently using OpenURI to download a file in Ruby. Unfortunately, it seems impossible to get the HTTP headers without downloading the full file:

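require 'open-uri'
require 'ruby-progressbar'

pbar = nil  # declared outside the lambdas so both of them can see it
URI.open(base_url,
  :content_length_proc => lambda { |t|
    pbar = ProgressBar.create(:total => t) if t && 0 < t
  },
  :progress_proc => lambda { |s|
    pbar.progress = s if pbar
  }) { |io|
  puts io.size
  puts io.meta['content-disposition']
}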

Running the code above shows that it first downloads the full file and only then prints the header I need.

Is there a way to get the headers before the full file is downloaded, so I can cancel the download if the headers are not what I expect them to be?

ePirat asked Jul 03 '13 17:07


2 Answers

You can use Net::HTTP for this. A HEAD request returns only the headers, without the body, for example:

require 'net/http'

http = Net::HTTP.start('stackoverflow.com')

resp = http.head('/')
resp.each { |k, v| puts "#{k}: #{v}" }
http.finish

Another example, this time getting the headers of the wonderful book Object-Oriented Programming With ANSI-C:

require 'net/http'

http = Net::HTTP.start('www.planetpdf.com')

resp = http.head('/codecuts/pdfs/ooc.pdf')
resp.each { |k, v| puts "#{k}: #{v}" }
http.finish
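
Since the question asks about Content-Disposition in particular, here is a minimal sketch that wraps the HEAD request in a helper taking a full URL (including HTTPS) and pulls the file name out of that header. The method names fetch_headers and filename_from_disposition are made up for this illustration, and the file name parsing is deliberately simplified: it ignores the encoded filename*= form.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: send a HEAD request and return the response
# headers as a Hash of lowercase header names to values.
def fetch_headers(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    response = http.head(uri.request_uri)
    headers = {}
    response.each_header { |name, value| headers[name] = value }
    headers
  end
end

# Hypothetical helper: extract the file name from a Content-Disposition
# value such as 'attachment; filename="ooc.pdf"'. Returns nil when the
# header is missing or carries no plain filename parameter.
def filename_from_disposition(value)
  return nil if value.nil?
  value[/filename="?([^";]+)"?/i, 1]
end
```

Usage would look like filename_from_disposition(fetch_headers(url)['content-disposition']). Keep in mind that a server is free to answer HEAD with fewer headers than it sends for GET, so a missing header here does not prove that the download would lack it.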

yeyo answered Oct 10 '22 17:10


It seems that what I wanted is not possible to achieve using OpenURI, at least not, as I said, without loading the whole file first.

I was able to do what I wanted using Net::HTTP's request_get.

Here is an example:

http.request_get('/largefile.jpg') {|response|
  # Header values are strings, so convert before comparing with a number
  if response['content-length'].to_i < max_length
    response.read_body do |str|   # read body now
      # save to file
    end
  end
}

Note that this only works when using a block. If you call it without one, like this:

response = http.request_get('/largefile.jpg')

the body will already have been read.
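
Put together, the block form could be sketched roughly as below. The names body_allowed? and download_if_small are invented for this example, and so is the rule of rejecting responses without a Content-Length. One further assumption worth checking against your Ruby version: as far as I can tell, Net::HTTP still reads an unread body once the block returns normally, so this sketch leaves the block early with return, which lets Net::HTTP.start close the connection instead.

```ruby
require 'net/http'
require 'uri'

# Hypothetical check: decide from the headers alone whether the body
# should be fetched. A missing Content-Length counts as a rejection.
def body_allowed?(response, max_length)
  length = response['content-length']
  !length.nil? && length.to_i <= max_length
end

# Hypothetical helper: stream the body to `destination` only if the
# headers pass the check. Returns true if the file was written.
def download_if_small(url, destination, max_length)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      # Returning here leaves the block before any of the body is read;
      # Net::HTTP.start then closes the connection on the way out.
      unless response.is_a?(Net::HTTPSuccess) && body_allowed?(response, max_length)
        return false
      end

      File.open(destination, 'wb') do |file|
        # read_body with a block yields the body in chunks as it arrives
        response.read_body { |chunk| file.write(chunk) }
      end
      return true
    end
  end
end
```

Because the connection is closed after an early exit, this trades connection reuse for not transferring the unwanted body. That seems a reasonable trade when the files are large.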

ePirat answered Oct 10 '22 17:10