I'm reading up on consistency models but can't seem to understand the concept of causality in distributed systems. I've googled quite a lot but don't find a good explanation of the concept. People usually explain why causality is a good thing, but what is the basic concept?
Assuming you are asking about the basic notion of causal relationships among events in distributed systems, the following may help get you on the right track.
In the absence of perfectly synchronised clocks shared by all processes of a distributed system, Leslie Lamport introduced the notion of logical clocks. A logical clock lets us establish a partial order over the events occurring in a distributed system via the so-called happened-before relation. This relation captures potential causality: if event a happened before event b, then a could have influenced b.
To illustrate a bit further, events on the same machine can be ordered by relying on the local clock. However, this is not generally an option for events that cross process boundaries. For message-passing events, we rely on the following insight instead: send(m) at process p occurs before receive(m) at process q, because a message cannot be received before it has been sent. Combined with the local ordering on each process and transitivity, this lets us establish causal relationships among events on different processes.
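To make the send/receive rule concrete, here is a minimal sketch of a Lamport-style logical clock in Python. The class name Process and its methods are made up for illustration and do not come from any library; the update rule (increment on each local event, take the maximum plus one on receive) follows the scheme described in Lamport's paper.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """Hypothetical process holding a single Lamport-style counter."""
    name: str
    clock: int = 0

    def local_event(self) -> int:
        # Every local event advances the local logical clock.
        self.clock += 1
        return self.clock

    def send(self) -> int:
        # Sending is itself an event; the returned timestamp
        # travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, msg_timestamp: int) -> int:
        # Jump past both the local clock and the sender's timestamp,
        # so that receive(m) is always stamped later than send(m).
        self.clock = max(self.clock, msg_timestamp) + 1
        return self.clock


p = Process("p")
q = Process("q")

# q does some unrelated local work first.
for _ in range(3):
    q.local_event()

t_send = p.send()           # send(m) at process p
t_recv = q.receive(t_send)  # receive(m) at process q

print(t_send, t_recv)
```

Here send(m) is stamped 1 and receive(m) is stamped 4, so the causal order is reflected in the timestamps. The converse does not hold: q's three local events are concurrent with p's send, even though some of them carry larger timestamps, which is why the resulting order on events is only partial.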
I am not sure how helpful my explanation is, but, if you have not already done so, read Leslie Lamport's original paper, Time, Clocks, and the Ordering of Events in a Distributed System, which should help clear things up for you. Next, you may want to look at Spanner: Google's Globally Distributed Database for a creative way to deal with the issue of time in a distributed system (TrueTime).
Hope this helps.