Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Logical Clocks: Lamport Timestamps

I am currently trying to understand Lamport timestamps. Consider two processes P1 (producing events a1, a2,...) and P2 (producing events b1, b2,...). Let C(e) denote the Lamport timestamp associated with event an e. I created timestamps for each event as described in the Wikipedia article about Lamport timestamps:

enter image description here

According to Wikipedia, the following relation is true for all events e1, e2:

If e1 happened before e2, then C(e1) < C(e2).

Let's look at a1 and b2. Clearly a1 happened before b2, and since C(a1) = 1 and C(b2) = 3, the relation holds true: C(a1) < C(b2).

Problem: The relation doesn't hold true for b3 and a3. Obviously, b3 happened before a3. However, C(b3) = 4 and C(a3) = 3. So C(b3) < C(a3) does not apply.

What did I misunderstand? Help is much appreciated!

like image 454
lkbaerenfaenger Avatar asked Jun 20 '15 18:06

lkbaerenfaenger


1 Answers

Yes, that definition might be slightly confusing. It’s correct what everyone else said so far, however maybe there is one word missing, to understand the system better. - concurrent

If your processes p1 and p2 are running on the same machine, there really isn't too much of a need to use lamport clocks (maybe some extremely specific case). Instead you can just use the clock provided by your OS. But what if p1 and p2 are on computers which are separated by a slow and unreliable network?

Lamport assumes, you can't trust your local clock and you don't have any global state of the distributed system, in which order events on 2 separate computers occurred. That's when you call those events occurring concurrently.

When you'd be debugging the execution of the distributed system and you'd see events a3 and b3 naturally you'd assume, a3 happened before b3. In your specific case you'd now claim, yeah, but that's wrong. However, because the events aren't related, as they didn't communicate with each other, the order is generally assumed to be concurrent and in such a case it doesn't matter which one happened first or second, for the whole execution of the system.

Since computers and networks are working so fast and still very precisely, it's sometimes difficult to comprehend, let's look at the same thing slightly differently:

p1 and p2 are two people living few 100 years ago in two different valleys. They communicate together using pidgins and never talk about when they did a certain task, just what they did. This way, nobody could know, if a3 happened before b3 or the other way around, hence they happened concurrently. Maybe not nobody, god watching from up on p1 and p2 could see it.

Unfortunately, when you have a distributed system, you can't be god and watching p1 and p2 at the same time, just out of the reason that messages from p1 might take longer than from p2. So even though your monitoring system (god) has received the information of b3 before it received the information about a4 it doesn't mean they happened in that order, maybe the packages containing information about a4 took just a longer or slower path.

In the end, there is another thing called vector clocks. Each process has a lamport clock for every process in the system. The key here is, event a would only happen before event b if all lamport clocks of a were smaller or equal those of b. If you'll try it out on your little example, you'll see that no event happened before the other => they are concurrent.

like image 96
peter Avatar answered Sep 19 '22 11:09

peter