
Why might Perl allow for http websites using TOR but not https?

Tags: https, perl, tor

I am having difficulty using Perl to visit a website via TOR when it is an https site, but not when it is an http site.

#!/usr/bin/perl
use strict;

use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;

my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http', 'https'], 'socks://localhost:9150');
$mech->get("https://www.google.com");

I am receiving the error message "Error GETing https://www.google.com: Status read failed: Bad file descriptor at line 10", where line 10 is the last line of the program.

In the TOR browser, I can successfully view "https://www.google.com" with a port of 9150. I am using ActivePerl 5.16.2, Vidalia 0.2.21, and Tor 0.2.3.25. I have a Windows machine and my primary internet browser is Mozilla.

I have tried installing packages with the commands:

cpan LWP::UserAgent
ppm install LWP::Protocol::https
cpan LWP::Protocol::https
ppm install LWP::Protocol::socks
cpan LWP::Protocol::socks
ppm install Mozilla::CA
ppm install IO::Socket::SSL
ppm install Crypt::SSLeay
cpan Crypt::SSLeay
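Because the modules above were installed with both ppm and cpan, it may be worth checking which versions Perl actually loads. The following is only a minimal diagnostic sketch: `module_version` and `report` are hypothetical helper names, and the module list simply repeats the packages named in the commands above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the version of a module, 'unknown' if it loads but declares
# no version, or undef if it cannot be loaded at all.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return unless eval { require $file; 1 };
    my $version = eval { $module->VERSION };
    return defined $version ? $version : 'unknown';
}

# Print one line per module: its name and version, or a note that it is missing.
sub report {
    my @modules = @_;
    for my $module (@modules) {
        my $version = module_version($module);
        printf "%-24s %s\n", $module,
            defined $version ? $version : 'NOT INSTALLED';
    }
    return;
}

report(qw(
    LWP::UserAgent LWP::Protocol::https LWP::Protocol::socks
    Mozilla::CA IO::Socket::SSL Crypt::SSLeay
));
```

Running this with the same perl binary that runs the failing script shows whether each module is visible to that interpreter at all.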

Thank you for any help! Please let me know whether there is any further information that I can provide.

paso asked Mar 21 '13 15:03


3 Answers

Some time ago, I found a way to reach https sites through Tor by fetching them with WWW::Curl::Easy, because I ran into the same problems when using LWP. After fetching, I save the HTML to files and parse them with WWW::Mechanize or HTML::TreeBuilder.

If you want more interactivity with the site, such as posting forms, this solution may be more tedious because you will need to drive curl yourself.


package Curl;

use strict;
use warnings;
use WWW::Curl::Easy;
use WWW::UserAgent::Random;


my $curl = WWW::Curl::Easy->new;
my $useragent = rand_ua("browsers");
my $host = 'localhost';
my $port = '9070';

my $timeout = '20';
my $connectTimeOut= '20';


&init;


sub get
{
        my $url = shift;

        $curl->setopt(CURLOPT_URL, $url);
        my $response_body;
        $curl->setopt(CURLOPT_WRITEDATA,\$response_body);


        my $retcode = $curl->perform;

        if ($retcode == 0) {
                my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
                print("Transfer went ok, HTTP code = $response_code\n");
                # judge result and next action based on $response_code


                return \$response_body;
        } else {
                # Error code, type of error, error message
                print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
                return 0;
        }


}



sub init
{
        # set the proxy
        $curl->setopt(CURLOPT_PROXY,"$host:".$port);
        $curl->setopt(CURLOPT_PROXYTYPE,CURLPROXY_SOCKS4);

        # set the remaining options
        $curl->setopt(CURLOPT_USERAGENT, $useragent);
        $curl->setopt(CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
        $curl->setopt(CURLOPT_TIMEOUT, $timeout);
        $curl->setopt(CURLOPT_SSL_VERIFYPEER,0);
        $curl->setopt(CURLOPT_HEADER,0);
}

Hope this will help you!

jordivador answered Nov 15 '22 04:11


Maybe the proxy that you are using is already an HTTPS proxy (i.e. a CONNECT proxy). In that case, this should work (untested):

#!/usr/bin/perl
use strict;

use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;

my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http'], 'socks://localhost:9150');
$mech->proxy(['https'], 'https://localhost:9150'); ### <-- make https go over https-connect proxy

$mech->get("https://www.google.com");

Emile Aben answered Nov 15 '22 03:11


I cannot find the original source, but I fought with this a long time ago. Basically, the problem I had was with the implementation that LWP::UserAgent used for https requests.

Possibly this question can help you: How do I force LWP to use Crypt::SSLeay for HTTPS requests?
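One commonly cited way to do that is the `PERL_NET_HTTPS_SSL_SOCKET_CLASS` environment variable, which Net::HTTPS reads when it is loaded; `Net::SSL` is the socket class shipped with Crypt::SSLeay. The sketch below assumes Crypt::SSLeay and LWP are installed. `fetch_status` is a hypothetical helper name, and whether this setting affects requests routed through LWP::Protocol::socks is not confirmed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Net::HTTPS reads this variable when it is first loaded, so it has to be
# set before anything pulls in LWP's https support. Hence the BEGIN block.
BEGIN { $ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS} = 'Net::SSL' }

# Fetch a URL and return the HTTP status line (hypothetical helper).
sub fetch_status {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded lazily, after the variable is set
    my $ua       = LWP::UserAgent->new(timeout => 60);
    my $response = $ua->get($url);
    return $response->status_line;
}

# Only touch the network when a URL is given on the command line.
print fetch_status($ARGV[0]), "\n" if @ARGV;
```

For example, saving this as a script and passing https://www.google.com as its only argument prints the status line of a direct (non-proxied) request, which helps separate SSL backend problems from SOCKS problems.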

CatOsMandros answered Nov 15 '22 03:11