
Detect redirect with ruby mechanize

I am using the mechanize/nokogiri gems to parse some random pages. I am having problems with 301/302 redirects. Here is a snippet of the code:

agent = Mechanize.new
page = agent.get('http://example.com/page1')

The test server at example.com redirects page1 to page2 with a 301/302 status code, so I expected:

page.code == "301"

Instead I always get page.code == "200".

My requirements are:

  • I want redirects to be followed (default mechanize behavior, which is good)
  • I want to be able to detect that the page was actually redirected

I know I can see page1 in agent.history, but that's not reliable, and I also want the redirect status code.

How can I achieve this behavior with mechanize?

user337620 asked Jul 06 '13 12:07



2 Answers

You could turn automatic redirects off and follow the Location header yourself:

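require 'mechanize'

agent = Mechanize.new
agent.redirect_ok = false          # do not follow redirects automatically
page = agent.get 'http://www.google.com'
status_code = page.code            # status of the first response, e.g. "301"

# follow 301/302 responses by hand until a non-redirect page is reached
while page.code[/30[12]/]
  page = agent.get page.header['location']
end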
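
If you also want to know which hops were taken, the same loop can record them. The following is only a sketch: `get_with_redirect_chain` and its `limit:` parameter are hypothetical names, not part of Mechanize, and it assumes the agent was created with `redirect_ok = false` so that 301/302 responses are returned instead of followed.

```ruby
# Hypothetical helper (not part of Mechanize): follows 301/302 responses by
# hand and records every hop. `agent` is expected to be a Mechanize instance
# with redirect_ok set to false; any object with a compatible #get works too.
def get_with_redirect_chain(agent, url, limit: 10)
  chain = []
  page = agent.get(url)

  while page.code =~ /\A30[12]\z/
    # Guard against redirect loops.
    raise "too many redirects (limit #{limit})" if chain.size >= limit

    location = page.header['location']
    break if location.nil? # redirect status without a Location header

    chain << { from: page.uri.to_s, to: location, code: page.code }
    page = agent.get(location)
  end

  # Final page plus the list of redirects that led to it (empty if none).
  [page, chain]
end

# Usage with Mechanize (assumes the mechanize gem is installed):
#   require 'mechanize'
#   agent = Mechanize.new
#   agent.redirect_ok = false
#   page, chain = get_with_redirect_chain(agent, 'http://example.com/page1')
#   redirected = chain.any?
#   first_code = chain.first[:code] if redirected
```

An empty chain means the page was served directly; otherwise each entry holds the status code of one redirect.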
pguardiario answered Oct 05 '22 11:10



I found a way to allow redirects and also get the status code, but I'm not sure it's the best method.

require 'mechanize'

agent = Mechanize.new

# deactivate redirects first
agent.redirect_ok = false

status_code = '200'
error_occurred = false

# request url
begin
  page = agent.get(url)
  status_code = page.code
rescue Mechanize::ResponseCodeError => ex
  status_code = ex.response_code
  error_occurred = true
end

if !error_occurred && status_code != '200'
  # the first response was not a 200 (e.g. a redirect):
  # enable redirects and request the page again
  agent.redirect_ok = true
  page = agent.get(url)
end
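
The same idea can be wrapped in a method so the agent's redirect setting is restored afterwards. This is a rough sketch, not a definitive implementation: `fetch_with_first_status` is a hypothetical name, and treating any status that starts with "3" as a redirect is an assumption. A `Mechanize::ResponseCodeError` is left to propagate to the caller.

```ruby
# Hypothetical helper (not part of Mechanize): returns the final page and the
# status code of the *first* response. `agent` is expected to be a Mechanize
# instance; any object with #get and a redirect_ok accessor works too.
def fetch_with_first_status(agent, url)
  previous = agent.redirect_ok
  agent.redirect_ok = false

  # First request: redirects are off, so a 3xx response comes back as-is.
  first = agent.get(url)
  first_code = first.code

  # Assumption: any 3xx status counts as a redirect.
  return [first, first_code] unless first_code.start_with?('3')

  # Second request: let the agent follow the redirect chain.
  agent.redirect_ok = true
  [agent.get(url), first_code]
ensure
  # Restore the caller's setting even if a request raised.
  agent.redirect_ok = previous
end

# Usage with Mechanize (assumes the mechanize gem is installed):
#   require 'mechanize'
#   agent = Mechanize.new
#   page, first_code = fetch_with_first_status(agent, 'http://example.com/page1')
#   redirected = first_code.start_with?('3')
```

Note that, like the snippet above, this requests the original URL twice whenever a redirect is detected.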
user337620 answered Oct 05 '22 12:10
