Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Download a zip file through Net::HTTP

Tags:

ruby

I am trying to download the latest.zip from WordPress.org using Net::HTTP. This is what I have got so far:

Net::HTTP.start("wordpress.org/") { |http|
  resp = http.get("latest.zip")
  open("a.zip", "wb") { |file| 
    file.write(resp.body)
  }
  puts "WordPress downloaded"
}

But this only gives me a 4 kilobytes 404 error HTML-page (if I change file to a.txt). I am thinking this has something to do with the URL probably is redirected somehow but I have no clue what I am doing. I am a newbie to Ruby.

like image 999
maetthew Avatar asked Dec 09 '22 10:12

maetthew


2 Answers

My first question is why use Net::HTTP, or code to download something that could be done more easily using curl or wget, which are designed to make it easy to download files?

But, since you want to download things using code, I'd recommend looking at Open-URI if you want to follow redirects. Its a standard library for Ruby, and very useful for fast HTTP/FTP access to pages and files:

require 'open-uri'

open('latest.zip', 'wb') do |fo|
  fo.print open('http://wordpress.org/latest.zip').read
end

I just ran that, waited a few seconds for it to finish, ran unzip against the downloaded file "latest.zip", and it expanded into the directory containing their content.

Beyond Open-URI, there's HTTPClient and Typhoeus, among others, that make it easy to open an HTTP connection and send queriers/receive data. They're very powerful and worth getting to know.

like image 102
the Tin Man Avatar answered Jan 01 '23 04:01

the Tin Man


NET::HTTP doesn't provide a nice way of following redirects, here is a piece of code that I've been using for a while now:

require 'net/http'
class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
  end

  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))

    if response.kind_of?(Net::HTTPRedirection)      
      self.url = redirect_url
      self.redirect_limit -= 1

      resolve
    end

    self.body = response.body
    self
  end

  def redirect_url
    if response['location'].nil?
      response.body.match(/<a href=\"([^>]+)\">/i)[1]
    else
      response['location']
    end
  end
end



wordpress = RedirectFollower.new('http://wordpress.org/latest.zip').resolve
puts wordpress.url
File.open("latest.zip", "w") do |file|
  file.write wordpress.body
end
like image 41
Mike Lewis Avatar answered Jan 01 '23 04:01

Mike Lewis