I'm trying to write my first Ruby program, but I have a problem. The code has to download 32 MP3 files over HTTP. It actually downloads a few, then times out.
I tried setting a timeout period, but it makes no difference. Running the code under Windows, Cygwin and Mac OS X has the same result.
This is the code:
require 'rubygems'
require 'open-uri'
require 'nokogiri'
require 'set'
require 'net/http'
require 'uri'
puts "\n Up and running!\n\n"
links_set = {}
pages = ['http://www.vimeo.com/siai/videos/sort:oldest',
         'http://www.vimeo.com/siai/videos/page:2/sort:oldest',
         'http://www.vimeo.com/siai/videos/page:3/sort:oldest']
pages.each do |page|
  doc = Nokogiri::HTML(open(page))
  doc.search('//*[@href]').each do |m|
    video_id = m[:href]
    if video_id.match(/^\/(\d+)$/i)
      links_set[video_id[/\d+/]] = m.children[0].to_s.split(" at ")[0].split(" -- ")[0]
    end
  end
end
links = links_set.to_a
p links
cookie = ''
file_name = ''
open("http://www.tubeminator.com") {|f|
  cookie = f.meta['set-cookie'].split(';')[0]
}
links.each do |link|
  open("http://www.tubeminator.com/ajax.php?function=downloadvideo&url=http%3A%2F%2Fwww.vimeo.com%2F" + link[0],
       "Cookie" => cookie) {|f|
    puts f.read
  }
  open("http://www.tubeminator.com/ajax.php?function=convertvideo&start=0&duration=1120&size=0&format=mp3&vq=high&aq=high",
       "Cookie" => cookie) {|f|
    file_name = f.read
  }
  puts file_name
  Net::HTTP.start("www.tubeminator.com") { |http|
    #http.read_timeout = 3600 # 1 hour
    resp = http.get("/download-video-" + file_name)
    open(link[1] + ".mp3", "wb") { |file|
      file.write(resp.body)
    }
  }
end
puts "\n Yay!!"
And this is the exception:
/Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:140:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error)
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:134:in `rbuf_fill'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:116:in `readuntil'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:126:in `readline'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:2138:in `read_status_line'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:2127:in `read_new'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:1120:in `transport_request'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:1106:in `request'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:312:in `block in open_http'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:564:in `start'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:767:in `buffer_open'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:669:in `open'
from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:33:in `open'
from test.rb:38:in `block in <main>'
from test.rb:37:in `each'
from test.rb:37:in `<main>'
I'd also appreciate your comments on the rest of the code.
Timeouts on http.request(): in Node.js, http.request() takes a timeout option. Its documentation says: timeout <number>: A number specifying the socket timeout in milliseconds. This will set the timeout before the socket is connected.
Net::ReadTimeout is raised when a chunk of data cannot be read within a specified amount of time.
The read timeout is the timeout on waiting to read data. If the server (or network) fails to deliver any data within <timeout> seconds after the client makes a socket read call, a read timeout error is raised.
The HyperText Transfer Protocol (HTTP) 408 Request Timeout response status code means that the server would like to shut down this unused connection. It is sent on an idle connection by some servers, even without any previous request by the client.
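Note that a 408 is a response the server sends, whereas a read timeout is raised on the client as an exception. As far as I know, from Ruby 2.0 onward Net::HTTP raises Net::ReadTimeout and Net::OpenTimeout, and both are subclasses of Timeout::Error, so a rescue Timeout::Error clause covers either one (on the 1.9.2 in the question you only see plain Timeout::Error). A small sketch to illustrate this; timeout_kind is a made-up helper name, not part of any library:

```ruby
require 'net/http'
require 'timeout'

# Hypothetical helper: label an exception by the kind of timeout it represents.
# The more specific classes must be checked before their parent Timeout::Error.
def timeout_kind(error)
  case error
  when Net::OpenTimeout then :open_timeout    # connection could not be opened in time
  when Net::ReadTimeout then :read_timeout    # no data arrived within read_timeout
  when Timeout::Error   then :generic_timeout # e.g. raised by Timeout.timeout
  else :not_a_timeout
  end
end

# Both Net::HTTP timeout classes inherit from Timeout::Error.
puts Net::ReadTimeout.ancestors.include?(Timeout::Error)
puts Net::OpenTimeout.ancestors.include?(Timeout::Error)
```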
For Ruby 1.8 I used this to solve my time-out issues. I reopen the Net::HTTP class in my code and alias its initializer, so every instance is still created with the default parameters and then gets my own read_timeout. That should keep things sane, I think.
require 'net/http'
# Lengthen timeout in Net::HTTP
module Net
  class HTTP
    alias old_initialize initialize
    def initialize(*args)
      old_initialize(*args)
      @read_timeout = 5*60 # 5 minutes
    end
  end
end
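If you would rather not reopen Net::HTTP globally, a similar effect can be had per connection, because read_timeout and open_timeout are ordinary accessors on a Net::HTTP instance. A minimal sketch, assuming Ruby 2.0 or later for keyword arguments; build_http and its default values are my own invention, not something from the question:

```ruby
require 'net/http'

# Hypothetical helper: build an unstarted connection with explicit timeouts.
# Net::HTTP.new only creates the object; nothing is sent until #start is called.
def build_http(host, port = 80, read_timeout: 5 * 60, open_timeout: 30)
  http = Net::HTTP.new(host, port)
  http.read_timeout = read_timeout # seconds to wait for each read
  http.open_timeout = open_timeout # seconds to wait for the connection to open
  http
end

http = build_http('www.tubeminator.com')
puts http.read_timeout

# Using it would then look like this (left commented out, it needs the network):
# http.start { |conn| conn.get('/download-video-' + file_name) }
```

Unlike the patch above, this only covers requests made through that one instance. Calls that go through open-uri build their own connection and are not affected.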
Your timeout isn't raised by the code you set the timeout for (the Net::HTTP.start block). The open-uri frames in the stack trace show that it happens here, where you use open-uri:
open("http://www.tubeminator.com/ajax.php?function=downloadvideo&url=http%3A%2F%2Fwww.vimeo.com%2F" + link[0],
You can set a read timeout for open-uri like so:
#!/usr/bin/ruby1.9
require 'open-uri'
open('http://stackoverflow.com', 'r', :read_timeout=>0.01) do |http|
  http.read
end
# => /usr/lib/ruby/1.9.0/net/protocol.rb:135:in `sysread': \
# => execution expired (Timeout::Error)
# => ...
# => from /tmp/foo.rb:5:in `<main>'
:read_timeout is new for Ruby 1.9 (it's not in Ruby 1.8). 0 or nil means "no timeout."
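Since the conversion service may simply be slow for some files, it can also help to retry a request that timed out instead of letting the whole script die. A rough sketch, assuming Ruby 2.0 or later for keyword arguments; with_retries is a hypothetical helper name and the attempt count is arbitrary. The failure is simulated here so the example runs without a network:

```ruby
require 'timeout'

# Hypothetical helper: run the block, retrying when it raises Timeout::Error.
# The block receives the current attempt number; the last failure is re-raised.
def with_retries(attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Timeout::Error
    raise if tries >= attempts
    sleep(wait) if wait > 0
    retry
  end
end

# Simulated flaky call: times out on the first two attempts, then succeeds.
result = with_retries(attempts: 3) do |attempt|
  raise Timeout::Error, 'simulated timeout' if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result
```

In the script from the question, the block would wrap the open-uri call, for example open(url, "Cookie" => cookie, :read_timeout => 60) { |f| f.read }. On current Rubies that call is spelled URI.open, since Kernel#open no longer accepts URLs.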