 

Distributed Concurrency Control

People also ask

What is concurrency control in distributed system?

Concurrency control is the activity of coordinating concurrent accesses to a database in a multiuser database management system (DBMS). Concurrency control permits users to access a database in a multiprogrammed fashion while preserving the illusion that each user is executing alone on a dedicated system.

What are the objectives of distributed concurrency control?

Concurrency control manages simultaneous transactions without letting them interfere with one another. The main objective of concurrency control is to allow many users to perform different operations at the same time. Running more than one transaction concurrently improves the performance of the system.

Why concurrency control is needed in distributed system?

Concurrency control is a very important issue in distributed database system design, because it allows many transactions to execute simultaneously while leaving the collection of manipulated data items in a consistent state.

What are the types of concurrency control?

Concurrency control protocols are categorized as:

  • Lock Based Concurrency Control Protocol
  • Time Stamp Concurrency Control Protocol
  • Validation Based Concurrency Control Protocol
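
As a toy illustration of the lock-based category (my own single-JVM sketch, not from any answer below): under strict two-phase locking a transaction acquires every lock it needs up front and releases them all only when it finishes. The distributed lock recipes in the answers below follow the same acquire/try/finally shape across machines.

import java.util.Map;
import java.util.SortedSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy strict two-phase locking over named items.
public class TwoPhaseLocking {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void runTransaction(SortedSet<String> keys, Runnable body) {
        // Growing phase: acquire in sorted key order so two transactions
        // can never deadlock by taking the same locks in opposite order.
        for (String key : keys) {
            locks.computeIfAbsent(key, k -> new ReentrantLock()).lock();
        }
        try {
            body.run(); // all reads/writes happen while every lock is held
        } finally {
            // Shrinking phase: release everything only at the end.
            for (String key : keys) {
                locks.get(key).unlock();
            }
        }
    }
}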


You might want to consider using Hazelcast distributed locks. Super lightweight and easy:

java.util.concurrent.locks.Lock lock = Hazelcast.getLock("mymonitor");
lock.lock();
try {
    // do your stuff
} finally {
    lock.unlock();
}

Hazelcast - Distributed Queue, Map, Set, List, Lock


We use Terracotta, so I would like to vote for that.

I've been following Hazelcast and it looks like another promising technology, but I can't vote for it since I've not used it, and, knowing that it uses a P2P-based system at its heart, I really would not trust it for large scaling needs.

But I have also heard of ZooKeeper, which came out of Yahoo and is moving under the Hadoop umbrella. If you're adventurous about trying out new technology, this one really has lots of promise, since it's very lean and mean, focusing on just coordination. I like the vision and promise, though it might be too green still (there's a lock sketch after the links below).

  • http://www.terracotta.org
  • http://wiki.apache.org/hadoop/ZooKeeper
  • http://www.hazelcast.com
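
For the ZooKeeper option, the standard lock recipe is an ephemeral sequential znode per waiter. A minimal sketch using the Apache Curator recipes library (my assumption; it isn't mentioned above), which packages that recipe as InterProcessMutex:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkLockExample {
    public static void main(String[] args) throws Exception {
        // "localhost:2181" is a placeholder ensemble address.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessMutex lock = new InterProcessMutex(client, "/locks/customer");
        lock.acquire(); // blocks until our znode is first in line
        try {
            // do your stuff
        } finally {
            lock.release();
        }
        client.close();
    }
}

Because the lock node is ephemeral, ZooKeeper releases it automatically if the holder's process dies, which is exactly the failure behavior you want from a distributed lock.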

Terracotta is closer to a "tiered" model - all client applications talk to a Terracotta Server Array (and more importantly for scale they don't talk to one another). The Terracotta Server Array is capable of being clustered for both scale and availability (mirrored, for availability, and striped, for scale).

In any case, as you probably know, Terracotta gives you the ability to express concurrency across the cluster the same way you do in a single JVM, by using POJO synchronized/wait/notify or by using any of the java.util.concurrent primitives such as ReentrantReadWriteLock, CyclicBarrier, AtomicLong, FutureTask and so on.

There are a lot of simple recipes demonstrating the use of these primitives in the Terracotta Cookbook.

As an example, I will post the ReentrantReadWriteLock example (note there is no "Terracotta" version of the lock - you just use a normal Java ReentrantReadWriteLock):

import java.util.concurrent.locks.*;

public class Main
{
    public static final Main instance = new Main();
    private int counter = 0;
    private ReentrantReadWriteLock rwl = new ReentrantReadWriteLock(true);

    public void read()
    {
        while (true) {
            rwl.readLock().lock();
            try {
                System.out.println("Counter is " + counter);
            } finally {
                rwl.readLock().unlock();
            }
            try { Thread.sleep(1000); } catch (InterruptedException ie) {  }
        }
    }

    public void write()
    {
        while (true) {
            rwl.writeLock().lock();
            try {
               counter++;
               System.out.println("Incrementing counter.  Counter is " + counter);
            } finally {
                 rwl.writeLock().unlock();
            }
            try { Thread.sleep(3000); } catch (InterruptedException ie) {  }
        }
    }

    public static void main(String[] args)
    {
        if (args.length > 0)  {
            // args --> Writer
            instance.write();
        } else {
            // no args --> Reader
            instance.read();
        }
    }
}

I recommend using Redisson. It implements over 30 distributed data structures and services, including java.util.concurrent.locks.Lock. Usage example:

Config config = new Config();
config.addAddress("some.server.com:8291");
Redisson redisson = Redisson.create(config);

Lock lock = redisson.getLock("anyLock");
lock.lock();
try {
    // ...
} finally {
    lock.unlock();
}

redisson.shutdown();

I was going to advise using memcached as a very fast, distributed RAM storage for keeping locks; but it seems that EHCache is a similar project, just more Java-centric.

Either one is the way to go, as long as you're sure to use atomic updates (memcached supports them; I don't know about EHCache). It's by far the most scalable solution.
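
The atomic update to build a lock on in memcached is add, which succeeds only if the key does not already exist. A rough sketch with the spymemcached client (server address, key name, and TTL are all made up):

import java.net.InetSocketAddress;

import net.spy.memcached.MemcachedClient;

public class MemcachedLockExample {
    public static void main(String[] args) throws Exception {
        MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));

        // add() is atomic: exactly one contender gets 'true' while the key exists.
        // The 30-second expiry bounds how long a crashed holder blocks everyone else.
        boolean acquired = client.add("lock:customer:42", 30, "owner").get();
        if (acquired) {
            try {
                // do your stuff
            } finally {
                // Note: an unconditional delete can release someone else's lock
                // if we held ours past the expiry; fine for a sketch only.
                client.delete("lock:customer:42");
            }
        }
        client.shutdown();
    }
}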

As a related data point, Google uses 'Chubby', a fast, RAM-based distributed lock service, as the root of several systems, among them BigTable.


I have done a lot of work with Coherence, which allowed several approaches to implementing a distributed lock. The naive approach was to request to lock the same logical object on all participating nodes. In Coherence terms this was locking a key on a Replicated Cache. This approach doesn't scale that well because the network traffic increases linearly as you add nodes. A smarter way was to use a Distributed Cache, where each node in the cluster is naturally responsible for a portion of the key space, so locking a key in such a cache always involved communication with at most one node. You could roll your own approach based on this idea, or better still, get Coherence. It really is the scalability toolkit of your dreams.
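
In Coherence terms the per-key approach looks roughly like this (cache and key names invented); NamedCache exposes ConcurrentMap-style lock/unlock on individual keys:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class CoherenceLockExample {
    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("customers");

        // In a Distributed (partitioned) Cache, locking a key only talks to
        // the one node that owns the key's partition, so lock traffic stays
        // constant as the cluster grows.
        if (cache.lock("customer:42", -1)) { // -1 = wait indefinitely
            try {
                // do your stuff
            } finally {
                cache.unlock("customer:42");
            }
        }
        CacheFactory.shutdown();
    }
}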

I would add that any half-decent multi-node, network-based locking mechanism would have to be reasonably sophisticated to act correctly in the event of any network failure.


Not sure if I understand the entire context, but it sounds like you have a single database backing this? Why not make use of the database's locking: if creating the customer is a single INSERT, then that statement alone can serve as a lock, since the database will reject a second INSERT that would violate one of your constraints (e.g. the fact that the customer name is unique).

If the "inserting of a customer" operation is not atomic and is a batch of statements then I would introduce (or use) an initial INSERT that creates some simple basic record identifying your customer (with the necessary UNIQUEness constraints) and then do all the other inserts/updates in the same transaction. Again the database will take care of consistency and any concurrent modifications will result in one of them failing.