I have written a Perl program that uses multithreading. I am using it to understand how multithreading works in Perl.
First a brief overview of what the program intends to do: It will read a list of URLs from a text file, one at a time. For each URL, it will call a subroutine (passing the URL as a parameter) and send an HTTP HEAD request to it. Once it receives the HTTP Response headers, it will print the Server Header field from the response.
For each URL, it starts a new thread which calls the above subroutine.
The problem: The program crashes intermittently. At other times it runs properly. The code appears to be unreliable, and I am sure there is a way to make it work reliably.
The code:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use WWW::Mechanize;
no warnings 'uninitialized';

open(INPUT,'<','urls.txt') || die("Couldn't open the file in read mode\n");
print "Starting main program\n";
my @threads;
while(my $url = <INPUT>)
{
    chomp $url;
    my $t = threads->new(\&sub1, $url);
    push(@threads,$t);
}
foreach (@threads) {
    $_->join;
}
print "End of main program\n";

sub sub1 {
    my $site = shift;
    sleep 1;
    my $mech = WWW::Mechanize->new();
    $mech->agent_alias('Windows IE 6');
    # trap any error which occurs while sending an HTTP HEAD request to the site
    eval{$mech->head($site);};
    if($@)
    {
        print "Error connecting to: ".$site."\n";
    }
    my $response = $mech->response();
    print $site." => ".$response->header('Server'),"\n";
}
Questions:
How can I make this program work reliably and what is the reason for sporadic crashes?
What is the purpose of calling the join method of the thread object?
As per the documentation at the link below, it will wait for the thread execution to complete. Am I invoking the join method correctly?
http://perldoc.perl.org/threads.html
If there are any good programming practices which I must include in the above code, please let me know.
Do I need to call sleep() explicitly in the code, or is it not required?
In C, we would call Sleep() after calling CreateThread() to begin the execution of the thread.
Regarding the crash: When the above Perl code crashes unexpectedly and sporadically, I get the error message: "Perl command line interpreter has stopped working"
Details of the crash:
Fault Module Name: ntdll.dll
Exception Code: c0000008
The above exception code corresponds to: STATUS_INVALID_HANDLE
Maybe this corresponds to an invalid thread handle.
Details of my Perl Installation:
Summary of my perl5 (revision 5 version 14 subversion 2) configuration:
Platform:
osname=MSWin32, osvers=5.2, archname=MSWin32-x86-multi-thread
useithreads=define
Details of the OS: Win 7 Ultimate, 64-bit OS.
I hope this information is sufficient to find the root cause of the issue and correct the code.
There is nothing wrong with your code. It may be that your expectations are a little too high.
Perl's threads are implemented by creating several interpreter instances within the same operating system process. This isolates the Perl code in each thread from all the others (it's share-nothing by default). What it doesn't (and can't) do is isolate code that's not under perl's control, that is, any module with a component written in C. For example, a quick look at WWW::Mechanize shows that it can use zlib for compression if it's installed. If that gets used, and that C code is not sufficiently thread-safe, it can crash the process.
So if you want to be sure that your Perl application will work well under threads, you have to go through all the modules it uses (and all the modules they use) and check that they either have no non-Perl parts or that those parts are thread-safe. For most nontrivial programs, that's an unreasonable amount of work (or an unreasonable limitation on which CPAN modules you can use).
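To make the share-nothing point concrete, here is a minimal sketch, assuming a perl built with ithreads (as in the question's configuration). The variable name $counter is made up for illustration. Each thread works on its own cloned copy of the variable, so the parent's copy is untouched, and join is what hands each thread's return value back to the parent.

```perl
use strict;
use warnings;
use threads;

# Hypothetical variable, used only to show per-thread copies.
my $counter = 0;

# Start three threads; each increments ITS OWN clone of $counter
# and returns the value it ended up with.
my @threads = map {
    threads->create(sub { $counter++; return $counter; })
} 1 .. 3;

# join waits for each thread to finish and collects its return value.
my @results = map { $_->join } @threads;

print "thread results: @results\n";   # thread results: 1 1 1
print "main counter: $counter\n";     # main counter: 0
```

If the variable were shared, the results would climb towards 3 and the parent would see the change. Variables are only shared when explicitly marked with the threads::shared module, which this sketch deliberately does not use.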
The difficulty of auditing every dependency like this is probably a large part of the reason why threads aren't used all that much in Perl.
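As a rough starting point for that kind of audit, the sketch below lists the modules that have loaded compiled (XS) code into the current process, using the @DynaLoader::dl_modules array that DynaLoader and XSLoader maintain. The helper name loaded_xs_modules is hypothetical, and List::Util only stands in for the program's real dependencies. The list covers only what has been loaded by the time it runs, so modules pulled in lazily later will not appear. It also says nothing about whether the compiled code is thread-safe, only that it exists.

```perl
use strict;
use warnings;
use DynaLoader ();   # declares @DynaLoader::dl_modules
use List::Util ();   # stand-in dependency that normally ships with an XS part

# Hypothetical helper: names of modules that have loaded compiled code so far.
sub loaded_xs_modules {
    my @modules = sort @DynaLoader::dl_modules;
    return @modules;
}

print "$_\n" for loaded_xs_modules();
```

In the program from the question, the same call could be made after the use lines (and ideally again after a first request, to catch lazily loaded modules) to see which dependencies would need a closer look.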