Why is a monitor implemented in terms of semaphores this way?

Question

I have trouble understanding the implementation of a monitor in terms of semaphores from Operating System Concepts

5.8.3 Implementing a Monitor Using Semaphores

We now consider a possible implementation of the monitor mechanism using semaphores.

For each monitor, a semaphore mutex (initialized to 1) is provided. A process must execute wait(mutex) before entering the monitor and must execute signal(mutex) after leaving the monitor.

Since a signaling process must wait until the resumed process either leaves or waits, an additional semaphore, next, is introduced, initialized to 0. The signaling processes can use next to suspend themselves. An integer variable next_count is also provided to count the number of processes suspended on next. Thus, each external function F is replaced by
wait(mutex);
...
body of F
...
if (next count > 0)
    signal(next);
else
    signal(mutex);
Mutual exclusion within a monitor is ensured.

We can now describe how condition variables are implemented as well. For each condition x, we introduce a semaphore x_sem and an integer variable x_count, both initialized to 0. The operation x.wait() can now be implemented as
x_count++;
if (next_count > 0)
    signal(next);
else
    signal(mutex);
wait(x sem);
x_count--;
The operation x.signal() can be implemented as
if (x_count > 0) {
    next_count++;
    signal(x_sem);
    wait(next);
    next_count--;
}

What does the reason for introducing semaphore next and the count next_count of processes suspended on next mean?

Why are x.wait() and x.signal() implemented the way they are?

Thanks.

Shabirmean · Accepted Answer

------- Note -------

WAIT() and SIGNAL() denote calls on monitor methods
wait() and signal() denote calls to semaphore methods, in the explanation that follows.

------- End of Note -------

I think it is easier if you think in terms of a concrete example. But before that let's first try to understand what a monitor is. As explained in the book a monitor is a Abstract Data Type meaning that it is not a real type which can be used to instantiate a variable. Rather it is like a specification with some rules and guidelines based on which different languages could provide support for process synchronization.

Semaphors were introduced as a software-based solution for achieving synchronization over hardware-based approaches like TestAndSet() or Swap(). Even with semaphores, the programmers had to ensure that they invoke the wait() & signal() methods in the right order and correctly. So, an abstract specification called monitors were introduced to encapsulate all these things related to synchronization as one primitive so simply any process executing inside the monitor will ensure that these methods (semaphore wait and signal) invocations are used accordingly.

With monitors all shared variables and functions (that use the shared variables) are put into the monitor structure and when any of these functions are invoked the monitor implementation takes care of ensuring that the shared resources are protected over mutual exclusion and any issues of synchronization.

Now with monitors unlike semaphores or other synchronization techniques we are not dealing with just one portion of the critical section but many of them in terms of different functions. In addition, we do also have shared variables that are accessed within these functions. For each of the different functions in a monitor to ensure only one of them is executed and no other process is executing on any of the functions, we can use a global semaphore called mutex.

Consider the example of the solution for the dining philosophers problem using monitors below.

monitor dining_philopher
{
     enum {THINKING, HUNGRY, EATING} state[5];
     condition self[5];

     void pickup(int i) {
         state[i] = HUNGRY;
         test(i);

         if (state[i] != EATING)
             self[i].WAIT();
     }

     void putdown(int i) {
         state[i] = THINKING;
         test((i + 4) % 5);
         test((i + 1) % 5);
     }

     void test(int i) {
         if (
              (state[(i + 4) % 5] != EATING) &&
              (state[i] == HUNGRY) &&
              (state[(i + 1) % 5] != EATING)) 
         {
                  state[i] = EATING;
                  self[i].SIGNAL();
         }
     }

     initialization code() {
          for (int i = 0; i < 5; i++)
              state[i] = THINKING;
          }
     }
}

Ideally, how a process might invoke these functions would be in the following sequence:

DiningPhilosophers.pickup(i);
...
  // do somework
...
DiningPhilosophers.putdown(i);

Now, whilst one process is executing inside the pickup() method another might try to invoke putdown() (or even the pickup) method. In order to ensure mutual exclusion we must ensure only one process is running inside the monitor at any given time. So, to handle these cases we have a global semaphore mutex that encapsulates all the invokable (pickup & putdown) methods. So these two methods will be implemented as follows:

     void pickup(int i) {
         // wait(mutex);

         state[i] = HUNGRY;
         test(i);

         if (state[i] != EATING)
             self[i].WAIT();

         // signal(mutex);
     }

     void putdown(int i) {
         // wait(mutex);

         state[i] = THINKING;
         test((i + 4) % 5);
         test((i + 1) % 5);

         // signal(mutex);
     }

Now only one process will be able to execute inside the monitor in any of its methods. Now, with this setup, if Process P1 has executed pickup() (but is yet tp putdown the chopsticks) and then Process P2 (say an adjacent diner) tries to pickup(): since his/her chopsticks (shared resource) is in use, it has to wait() for it to be available. Let's look at the WAIT and SIGNAL implementation of the monitor's conditional variables:

WAIT(){
    x_count++;

    if (next_count > 0)
        signal(next);
    else
        signal(mutex);

    wait(x_sem);
    x_count--;
}

SIGNAL() {
    if (x_count > 0) {
        next_count++;
        signal(x_sem);
        wait(next);
        next_count--;
    }
}

The WAIT implementation of the conditional variables is different from that of the Semaphore's because it has to provide more functionality, like allowing other processes to invoke functions of the monitor (whilst it waits) by releasing the mutex global semaphore. So, when WAIT is invoked by P2 from the pickup() method, it will call signal(mutex) allowing other processes to invoke the monitor methods and call wait(x_sem) on the semaphore specific to the conditional. Now, P2 is blocked here. In addition, the variable x_count keeps track of the number of Processes waiting on the conditional variable (self).

So when P1 invokes putdown(), this will invoke SIGNAL via the test() method. Inside SIGNAL when P1 invokes signal(x_sem) on the chopstick it holds, it must do one additional thing. It must ensure that only one process is running inside the monitor. If it would only call signal(x_sem) then from that point onwards P1 and P2 both would start doing things inside the monitor. To prevent this P1, after releasing its chopstick it will block itself until P2 finishes. To block itself, it uses the semaphore next. And to notify P2 or some other process that there is someone blocked it uses a counter next_count.

So, now P2 would get the chopsticks and before it exits the pickup() method it must release P1 who is waiting on P2 to finish. So now, we must change the pickup() method (and all functions of the monitor) as follows:

     void pickup(int i) {
         // wait(mutex);

         state[i] = HUNGRY;
         test(i);

         if (state[i] != EATING)
             self[i].WAIT();

         /**************
         if (next_count > 0)
             signal(next);
         else
             signal(mutex);
         **************/
     }

     void putdown(int i) {
         // wait(mutex);

         state[i] = THINKING;
         test((i + 4) % 5);
         test((i + 1) % 5);

         /**************
         if (next_count > 0)
             signal(next);
         else
             signal(mutex);
         **************/
     }

So now, before any process exits a function of the monitor, it checks if there are any waiting processes and if so releases them and not the mutex global semaphore. And the last of such waiting processes will release the mutex semaphore allowing new processes to enter into the monitor functions.

I know it's pretty long, but it took some time for me to understand and wanted to put it in writing. I will post it on a blog soon.

If there are any mistakes please let me know.

Best,
Shabir

Why is a monitor implemented in terms of semaphores this way?

Tags:

computer-science

operating-system

synchronization

semaphore

monitor

Tim

1 Answers

Shabirmean

Recent Activity

Donate For Us

Why is a monitor implemented in terms of semaphores this way?

Tags:

computer-science

operating-system

synchronization

semaphore

monitor

Tim

1 Answers

Shabirmean

Related questions

Recent Activity

Donate For Us