
pthread Thread objects reset their state

While working with the pthreads extension recently, I discovered an anomaly. I have a simple object with an internal state:

class Sum {
    private $value = 0;
    public function add($inc)  { $this->value += $inc; }
    public function getValue() { return $this->value; }
}

I then created a Thread class that does something with this object:

class MyThread extends Thread {
    private $sum;

    public function __construct(Sum $sum) {
        $this->sum = $sum;
    }

    public function run(){
        for ($i=0; $i < 10; $i++) {
            $this->sum->add(5);
            echo $this->sum->getValue() . " ";
        }
    }
}

In my main function, I created a Sum object, injected it into the thread, and started it:

$sum = new Sum();
$thread = new MyThread($sum);
$thread->start();
$thread->join();
echo $sum->getValue();

I expected the result to be 50, because the thread increments the value 10 times by 5. But I got 0!

More curiously, it is not the synchronization back into the main thread that fails; the thread itself seems to forget its internal state along the way. The output of the echo inside the run() method is not the expected 5 10 15 20 25 30 35 40 45 50 but 0 0 0 0 0 0 0 0 0 0. Nobody is interfering with the thread, so why does it not preserve its state?


Side note: If I do not start the thread but instead call the run() method directly in the main thread ($thread->run();), the result is still the same. But if I then remove the extends Thread from the class declaration, it works perfectly and returns the expected 5 10 15 20 25 30 35 40 45 50.

Francois Bourgeois asked Jan 28 '13 14:01


2 Answers

Any object not descended from a pthreads definition is serialized when it is set as a member of an object descended from pthreads.

Operations like += and [] use pointers internally, and serialization is incompatible with pointers for other objects. The introduction page of the manual states that any object intended to be manipulated by multiple contexts should extend Stackable, Thread or Worker, like this:

<?php
class Sum extends Stackable {
    private $value = 0;
    public function add($inc)  { $this->value += $inc; }
    public function getValue() { return $this->value; }
    public function run(){}
}

class MyThread extends Thread {
    public $sum;

    public function __construct(Sum $sum) {
        $this->sum = $sum;
    }

    public function run(){
        for ($i=0; $i < 10; $i++) {
            $this->sum->add(5);
            echo $this->sum->getValue() . " ";
        }
    }
}

$sum = new Sum();
$thread = new MyThread($sum);
$thread->start();
$thread->join();
echo $sum->getValue();
?>

If Sum weren't using pointers you would have the option of retrieving the reference from the threaded object after join.

These are simple operations, so you are not required to synchronize. The only time you should synchronize is when you plan to wait on an object or notify one.

The objects bundled with pthreads are much better suited to this environment and are never serialized.
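To make the serialization point concrete, here is a minimal sketch in plain PHP that does not use pthreads at all. SerializingHolder is a hypothetical stand-in, not part of the extension; it only imitates the behaviour described above, under the assumption that a non-pthreads member is stored serialized and every read hands back a fresh unserialized copy.

```php
<?php
class Sum {
    private $value = 0;
    public function add($inc)  { $this->value += $inc; }
    public function getValue() { return $this->value; }
}

/* Hypothetical stand-in for a pthreads object (an assumption for illustration):
   members are stored serialized, and each read returns a new copy */
class SerializingHolder {
    private $stored = array();

    public function __set($name, $value) {
        $this->stored[$name] = serialize($value);
    }

    public function __get($name) {
        return unserialize($this->stored[$name]);
    }
}

$holder = new SerializingHolder();
$holder->sum = new Sum();

for ($i = 0; $i < 3; $i++) {
    $holder->sum->add(5);               /* mutates a temporary copy that is then discarded */
    echo $holder->sum->getValue(), " "; /* reads another fresh copy */
}
echo "\n"; /* the loop printed: 0 0 0 */

/* reading into a local, changing it and writing it back does persist in this model */
$copy = $holder->sum;
$copy->add(5);
$holder->sum = $copy;
echo $holder->sum->getValue(), "\n"; /* prints 5 */
```

In this model the in-place call to add() is lost because it only ever touches a throwaway copy, which matches the 0 0 0 ... output in the question. How pthreads implements this internally may differ; the sketch only shows why in-place mutation and serialized storage do not mix.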

Please do read the intro in the manual and all the examples in the methods you wish to utilize to find out exactly what is what, then feel free to ask why :)

I know that users of PHP aren't used to having to do research, but we are pushing the envelope here. You will find there are correct ways to do things and incorrect ways; most of them are documented in examples, and anything that's not will, I'm sure, be extracted from me on SO and eventually find its way into the documentation.

I'm not sure if the example you gave was testing out objects in particular, but the code you provided need not be two objects, and shouldn't be two objects either. Consider the following:

<?php
class MyThread extends Thread {
    public $sum;

    public function run(){
        for ($i=0; $i < 10; $i++) {
            $this->add(5);

            printf("%d ", $this->sum);
        }
    }

    public function add($num) { $this->sum += $num; }
    public function getValue() { return $this->sum; }
}

$thread = new MyThread();
$thread->start();
$thread->join();
var_dump($thread->getValue());
?>

It may be useful for you to see a couple more features in action with an explanation, so here is an example similar to yours:

<?php
class MyThread extends Thread {
    public $sum;

    public function __construct() {
        $this->sum = 0;
    }

    public function run(){
        for ($i=0; $i < 10; $i++) {
            $this->add(5);

            $this->writeOut("[%d]: %d\n", $i, $this->sum);
        }

        $this->synchronized(function($thread){
            $thread->writeOut("Sending notification to Process from %s #%lu ...\n", __CLASS__, $thread->getThreadId());
            $thread->notify();
        }, $this);
    }

    public function add($num) { $this->sum += $num; }
    public function getValue() { return $this->sum; }

    /* when two threads attempt to write standard output the output will be jumbled */
    /* this is a good use of protecting a method so that 
        only one context can write stdout and you can make sense of the output */
    protected function writeOut($format, $args = null) {
        $args = func_get_args();
        if ($args) {
            vprintf(array_shift($args), $args);
        }
    }
}

$thread = new MyThread();
$thread->start();

/* this is synchronization rather than joining, which requires an actual join of the underlying thread */
/* you can wait for notification that the thread has finished what you started it to do */
/* in these simple tests the time difference may not be apparent, but in real complex objects from */
/* contexts populated with more than 1 object having executed many instructions the difference may become very real */
$thread->synchronized(function($thread){
    if ($thread->getValue()!=50) {
        $thread->writeOut("Waiting for thread ...\n");
        /* you should only ever wait _for_ something */
        $thread->wait();
        $thread->writeOut("Process received notification from Thread ...\n");
    }
}, $thread);

var_dump($thread->getValue());
?>

This combines some of the more advanced features in some simple examples, and is commented to help you along. On the subject of sharing objects, there's nothing wrong with passing around a Thread object if it contains some functionality and data required in other threads or stackables. You should aim to use as few threads and objects as possible in order to get the job done.

Joe Watkins answered Sep 22 '22 04:09


Your problem is that you are accessing the variable from both the main thread and the MyThread thread. The CPU caches the variable, and it gets updated in the cache for MyThread but not in the cache for the main thread, so neither thread ever sees the other thread's changes. Java, C, and similar languages have the keyword volatile, but I don't know whether that exists in PHP.

I think you should try calling the methods on sum inside a synchronized block ( http://php.net/manual/en/thread.synchronized.php )

For example, instead of:

        $this->sum->add(5);

Call:

        $this->synchronized(function($thread){
            $thread->sum->add(5);
        }, $this);

th3falc0n answered Sep 26 '22 04:09