I wrote a simple function that handles fetching a URL:
    def tender_page_get url, agent
      sleep(rand(6)+2)
      begin
        return agent.get(url).parser
      rescue Errno::ETIMEDOUT, Timeout::Error, Net::HTTPNotFound
        EYE.debug "--winter sleep #{url}"
        puts "-x-#{url}"
        sleep(300)
        tender_page_get url, agent
      rescue => e
        puts "-x-#{url}"
        EYE.debug "--unknown exception"
        EYE.debug "#{url} #{e.inspect}"
      end
    end
The problem is that, even though I rescue Net::HTTPNotFound
in my first rescue block, I still see log records like:
--unknown exception
{url} 404 => Net::HTTPNotFound
which means that this exception was caught by the second rescue block. What could be the reason for that?
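Mechanize raises a Mechanize::ResponseCodeError for a 404, not a Net::HTTPNotFound. Net::HTTPNotFound is an HTTP response class rather than an exception class, so listing it in a rescue clause never matches anything. The to_s on Mechanize::ResponseCodeError looks like this: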
    def to_s
      "#{response_code} => #{Net::HTTPResponse::CODE_TO_OBJ[response_code]}"
    end
This returns '404 => Net::HTTPNotFound', which makes it look as though Net::HTTPNotFound is the exception being raised.
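To see the effect in isolation, here is a minimal sketch that runs with the standard library only. StandInResponseCodeError and which_rescue are hypothetical names invented for this illustration. The stand-in copies just the to_s shown above so that the mechanize gem is not needed, and it is not Mechanize's real class.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical stand-in for Mechanize::ResponseCodeError (an assumption made
# so the sketch runs without the mechanize gem). Only the to_s is copied.
class StandInResponseCodeError < StandardError
  attr_reader :response_code

  def initialize(response_code)
    @response_code = response_code
    super()
  end

  def to_s
    "#{response_code} => #{Net::HTTPResponse::CODE_TO_OBJ[response_code]}"
  end
end

# Mirrors the rescue layout from the question and reports which branch ran.
def which_rescue
  yield
  :no_error
rescue Errno::ETIMEDOUT, Timeout::Error, Net::HTTPNotFound
  :first_rescue
rescue StandardError => e
  [:second_rescue, e.message]
end

# Net::HTTPNotFound is a response class, so Exception is not among its ancestors.
NOT_FOUND_IS_EXCEPTION = Net::HTTPNotFound.ancestors.include?(Exception)

# Raising the stand-in with a '404' code falls through to the generic rescue.
RESULT = which_rescue { raise StandInResponseCodeError.new('404') }

puts NOT_FOUND_IS_EXCEPTION.inspect
puts RESULT.inspect
```

In the real function, the likely fix is to rescue Mechanize::ResponseCodeError in the first block and, if only 404s should trigger the long sleep, to check its response_code there.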