
Access website - WWW::Mechanize

Tags: html, perl

I use the code below to fetch a website's HTML source, and it works for http://www.apple.com. However, I get no result when I request http://reserve.apple.com/WebObjects/ProductReservation.woa/wa/reserveProduct with the same code, even though the page loads properly in a browser. Could you give me some hints on how to fix this? Thank you.

#!/usr/bin/perl

use strict;
use warnings;

# create a new browser
use WWW::Mechanize;
my $browser = WWW::Mechanize->new();

# tell it to get the main page
my $sURL = 'http://www.apple.com';
#my $sURL = 'http://reserve.apple.com/WebObjects/ProductReservation.woa/wa/reserveProduct';

$browser->get($sURL);

print $browser->content;

exit(0);
Tommy Liu asked Nov 13 '11 10:11




1 Answer

It's strange behavior, but the site at the URL you want to retrieve requires the following headers to be defined: Accept, Accept-Encoding, Accept-Language, Accept-Charset, Cookie.

Otherwise the server does not respond at all.

You can easily do this by inserting the following code before your "get" request:

$browser->add_header(
    "Accept"          => "",
    "Accept-Encoding" => "",
    "Accept-Language" => "",
    "Accept-Charset"  => "",
    "Cookie"          => ""
);

You can supply real values instead of empty strings, but empty strings work too.
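For reference, a minimal sketch of the complete script with those headers in place might look like the following. The helper names `browser_like_headers` and `fetch_page`, the specific header values, and the 30-second timeout are illustrative assumptions, not something the server is known to require; only the five header names come from the answer.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use WWW::Mechanize;

# Hypothetical helper: returns the five headers the server expects to see.
# The values are illustrative assumptions; the answer used empty strings.
sub browser_like_headers {
    return (
        'Accept'          => 'text/html,application/xhtml+xml,*/*;q=0.8',
        'Accept-Encoding' => 'identity',
        'Accept-Language' => 'en-US,en;q=0.8',
        'Accept-Charset'  => 'utf-8',
        'Cookie'          => '',
    );
}

# Hypothetical helper: fetches a URL and returns its content, or dies with
# the HTTP status line if the request fails.
sub fetch_page {
    my ($url) = @_;

    # autocheck => 0 lets us report the failure ourselves;
    # timeout (in seconds) is passed through to LWP::UserAgent.
    my $browser = WWW::Mechanize->new( autocheck => 0, timeout => 30 );

    # Headers added here are sent with every later request.
    $browser->add_header( browser_like_headers() );

    $browser->get($url);
    die "Request for $url failed: ", $browser->response->status_line, "\n"
        unless $browser->success;

    return $browser->content;
}

# Fetch and print every URL given on the command line.
print fetch_page($_) for @ARGV;
```

Run it with the URL as an argument, for example `perl fetch.pl http://www.apple.com` (the file name is hypothetical). Turning off autocheck and setting a timeout is one way to get a clear error message instead of a long hang if the server stays silent.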

yko answered Oct 23 '22 04:10