
How does Perl's threading system work?

Perl's documentation says: "Since Perl 5.8, thread programming has been available using a model called interpreter threads which provides a new Perl interpreter for each thread."

Running ps -Lm <pid> against the program below, I can see that the threads run in parallel, i.e., they run at the same time on different cores. But even with 4 threads (3 plus the main one), ps aux shows only one Perl process.

  1. Does this mean that there is a whole Perl interpreter on each thread?
  2. Are Perl threads mapped to system threads?
  3. If 2 is true, how is it possible to have multiple Perl interpreters within a single process?
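
use strict;
use warnings;
use threads;

# Start three worker threads; each runs sub1 in its own interpreter.
my $thr  = threads->new(\&sub1);
my $thr2 = threads->new(\&sub1);
my $thr3 = threads->new(\&sub1);

# Busy loop that keeps one core occupied so the threads show up in ps.
sub sub1 {
    my $i = 0;
    while (1) {    # the bareword "true" is not a Perl boolean
        $i = int(rand(10)) + $i;
    }
}

# The workers never return, so this join blocks forever by design.
$thr->join;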

user454322 asked Sep 21 '12 18:09


2 Answers

"Perl interpreter" refers to the environment in which the Perl code executes. From a user's perspective, that's mostly the symbol table and the globals therein, but it also includes a slew of internal variables (e.g. those used during parsing, the current op, etc).

  1. Yes, there's a Perl interpreter for each thread.

  2. Yes, Perl threads are system threads.

  3. Think of "Perl interpreter" as a class of which you can make any number of instances.* Perl refers to this as Multiplicity. See perlembed for how to embed a Perl interpreter in your application.


* Requires the use of -Dusemultiplicity when building Perl, which is implied by -Dusethreads, which is how thread support is added to Perl. Otherwise, a whole bunch of globals are used instead of a "class".
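
Whether a particular perl binary was built this way can be checked from Perl itself. This is only a minimal sketch: it reads the build settings recorded by the core Config module, and, if threads are available, starts one thread and asks for its id. The %build hash and $tid are names chosen for the illustration.

```perl
use strict;
use warnings;
use Config;    # core module exposing the build-time configuration

# Both settings read 'define' on a perl built with -Dusethreads.
my %build = map { $_ => ($Config{$_} // 'undef') } qw(useithreads usemultiplicity);
printf "%-16s %s\n", $_, $build{$_} for sort keys %build;

if ($Config{useithreads}) {
    # Load threads only when the build supports them.
    require threads;

    # Run one thread and return its id (the main thread has id 0).
    my $tid = threads->create(sub { threads->tid })->join;
    print "child thread id: $tid\n";
}
```

On a perl built without thread support, the guard skips the threads part instead of dying when the module is loaded.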

ikegami answered Oct 17 '22 01:10


To amplify ikegami's answer to your third question, Perl creates a complete copy of the entire state of the interpreter for each operating system thread. This means all the data and code are copied. On the downside, this makes creating threads slow, and Perl threads are memory hungry.
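
The copy is taken when the thread is created, so the thread sees a snapshot of the parent's data. A small sketch of that, assuming a perl built with thread support (@data and $seen_by_child are illustrative names):

```perl
use strict;
use warnings;
use threads;

my @data = (1, 2, 3);    # exists before the thread is created

my $thr = threads->create(sub {
    # This @data is the thread's own clone, made at creation time.
    return scalar @data;
});

push @data, 4, 5;        # changes only the parent's copy
my $seen_by_child = $thr->join;

print "parent has ", scalar(@data), " elements, child saw $seen_by_child\n";
# prints "parent has 5 elements, child saw 3"
```

The result does not depend on whether the child runs before or after the push, because the clone already exists by the time threads->create returns.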

On the upside, threads are isolated from each other, which makes it much easier to write thread-safe code. For example, most modules are inherently thread safe without the author having to do anything special or think about threads at all.
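
To illustrate the isolation: an ordinary variable modified inside a thread leaves the parent's copy untouched, and data is shared only when you ask for it. The sketch below uses the core threads::shared module, which the answers here don't mention, so treat it as one possible way to opt in to sharing. The variable names are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;             # cloned into every thread
my $shared :shared = 0;      # one value visible to all threads

my @workers;
for (1 .. 3) {
    push @workers, threads->create(sub {
        $private++;          # touches this thread's clone only
        lock($shared);       # held until the sub returns
        $shared++;
    });
}
$_->join for @workers;

print "private = $private, shared = $shared\n";
# prints "private = 0, shared = 3"
```

Only the shared variable needs a lock. The unshared one cannot race, because each thread increments its own copy.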

This is Perl's second thread implementation. The first, 5.005 threads, was a more traditional threading model where threads shared code and global variables. It didn't work very well. Worse, it rendered most CPAN modules useless as their uncoordinated global variables clashed with each other amongst the various threads.

What makes this possible is "multiplicity", which ikegami mentioned and explained. It originally sprang out of the desire to embed a Perl interpreter in another C or C++ program. That required changing how Perl works so that it isolates all its global data (global variables and compiled code) per interpreter object, rather than assuming it is the only Perl interpreter running in the process. From there, multiplicity (multiple Perl interpreters within a single process) was used to emulate fork on Windows. Finally, 5.8 threads were built on top of that extensive work.

Schwern answered Oct 17 '22 02:10