
Directly accessible data structure Java

I have the following situation:

  1. A data structure which can only ever be extended (I only ever add things at the tail)
  2. I need to be able to keep track of which elements I have already seen (I have an index, and ideally I want to be able to start traversing the list again from this particular element)
  3. I would like the reads to never be blocking, and the addition of the new element to only ever lock the tail of the queue rather than the whole queue

This is a structure which is heavily modified by multiple threads.

What would be the best data structure for this?

ArrayList. This would be ideal, since I could directly access the last element seen using its index, but it leads to ConcurrentModificationExceptions. I could make it synchronised, but I would like to avoid locking (or at least any locking apart from the very last element, as that is the only place where there might be concurrent writes to add new elements).

ConcurrentLinkedQueue. This would solve my concurrency problem, but I would have to store the current position of the iteration rather than an integer index. It also returns a weakly consistent iterator, which is not guaranteed to return new objects that have been added to the list since the iterator was created (source: javadoc).

ConcurrentHashMap with index as keys. This has the benefit that I can directly access the data corresponding to a given index, but there isn't a "getNext" operator that would allow me to efficiently traverse the elements from index to index + 1, and so on.

Vector. This would solve most of my problems: it won't throw ConcurrentModificationExceptions and it allows direct access. However, given that all methods are synchronised, the performance is poor compared to ArrayList. Since I only ever want to extend the structure, and not insert records in the middle, I'm reluctant to go for this heavyweight solution, where reads also suffer a performance hit (whereas, given my use case, the index of an element never actually changes, so there's no need to synchronise reads that are not at the tail).

Custom data structure: keep an array of the objects I want to store and a pointer to the tail of this array (the last element set). When inserting a new object, lock the tail and the object pointed to by the tail. When the array exceeds its current size, do a locking resize operation.

What would be the best strategy? Is there any other, more efficient implementation?

user1018513 asked Apr 25 '13 09:04




3 Answers

The CopyOnWriteArrayList structure could solve your problem (java.util.concurrent).

  • CopyOnWriteArrayList is thread-safe because all mutative operations are implemented by creating a fresh copy of the underlying array.

  • The problem of ConcurrentModificationException is avoided because the array doesn't change while it is being iterated. The so-called snapshot-style iterator uses a reference to the state of the array at the time the iterator was created.

  • If you have many more reads than writes, use CopyOnWriteArrayList; otherwise use Vector.

  • Vector introduces a small synchronization delay for each operation, whereas CopyOnWriteArrayList has a longer delay for writes (due to copying) but no delay for reads.

  • Vector requires explicit synchronization when you are iterating it (so write operations can't be executed at the same time), CopyOnWriteArrayList doesn't.
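
As a rough illustration of how this could fit the question's "resume from the last index I saw" requirement, here is a minimal sketch. The class name CowReaderSketch and the method drainFrom are hypothetical and only used for illustration. It assumes elements are only ever appended, so any index below a previously observed size() stays valid.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of java.util.concurrent.
class CowReaderSketch {

    // Copies every element from index 'from' (inclusive) up to the tail observed
    // at call time into 'out', and returns the index to resume from next time.
    // Intended for a CopyOnWriteArrayList that is only ever appended to.
    static <T> int drainFrom(List<T> source, int from, List<? super T> out) {
        int tail = source.size();        // tail as seen right now; no lock is taken
        for (int i = from; i < tail; i++) {
            out.add(source.get(i));      // indexed read on the current array snapshot
        }
        return tail;                     // elements appended later are picked up next call
    }
}
```

Each reader thread keeps its own resume index, so readers do not need to coordinate with each other. Note that every add() still copies the whole backing array, so this only pays off when appends are rare relative to reads.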

Xaltar answered Oct 06 '22 12:10


Looking into this I came to the same solution as @MissingNumber.

Use a ConcurrentHashMap as backing data structure:

  • non-blocking-reads
  • thread-safe appending

To add random access by index, use an AtomicInteger to maintain the index and use it as the key under which each value is stored in the map.

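import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentListMap {

  private final ConcurrentHashMap<Integer, Object> backingMap;
  private final AtomicInteger index;

  public ConcurrentListMap() {
    backingMap = new ConcurrentHashMap<>();
    index = new AtomicInteger(0);
  }

  // Claims the next index atomically, then stores the value under it.
  // Indices start at 1 because incrementAndGet() returns the updated value.
  public int append(final Object value) {
    final int newIndex = index.incrementAndGet();
    backingMap.put(newIndex, value);
    return newIndex;
  }

  // Non-blocking read; returns null if nothing is stored under this index (yet).
  public Object get(final int entry) {
    return backingMap.get(entry);
  }

  // Highest index handed out so far. A concurrent append may have claimed it
  // without having stored its value yet, so get(getTailIndex()) can briefly return null.
  public int getTailIndex() {
    return index.get();
  }
}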
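
To show how a reader might traverse from its last seen index up to the tail, here is a minimal usage sketch. The wrapper class ListMapReaderSketch and the method readFrom are hypothetical. The nested class is a condensed copy of the class above so that the sketch compiles on its own. The sketch assumes that a null result means "index claimed, value not stored yet", which is why the reader stops there and retries later.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper for illustration only.
class ListMapReaderSketch {

    // Condensed copy of the ConcurrentListMap shown above.
    static final class ConcurrentListMap {
        private final ConcurrentHashMap<Integer, Object> backingMap = new ConcurrentHashMap<>();
        private final AtomicInteger index = new AtomicInteger(0);

        int append(Object value) {
            int newIndex = index.incrementAndGet(); // first index handed out is 1
            backingMap.put(newIndex, value);
            return newIndex;
        }

        Object get(int entry) { return backingMap.get(entry); }

        int getTailIndex() { return index.get(); }
    }

    // Reads the entries after 'lastSeen' up to the current tail into 'out'
    // and returns the last index that was actually read.
    static int readFrom(ConcurrentListMap list, int lastSeen, List<Object> out) {
        int tail = list.getTailIndex();
        int i = lastSeen + 1;
        for (; i <= tail; i++) {
            Object value = list.get(i);
            if (value == null) {
                break; // a writer claimed this index but has not stored its value yet
            }
            out.add(value);
        }
        return i - 1;
    }
}
```

A reader would start with lastSeen = 0 and pass the returned value into the next call, so it never re-reads entries and never skips one that was still being written.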
OliverS answered Oct 06 '22 11:10


This very much sounds like you need a Disruptor or, in simple words, a lock-free queue. I wish I could add an example here, but I only started working with it yesterday. I could also tell you how it works, or you can read a far better explanation here:

The general idea is that it is completely lock-free: it only uses CAS (compare-and-swap) operations, which Java exposes through the AtomicXXX classes. I simply fell in love with the idea.

LMAX
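
To make the CAS idea a bit more concrete, here is a minimal sketch of a fixed-capacity, append-only log built on AtomicInteger and AtomicReferenceArray. This is not the Disruptor and not how LMAX implements it. The class name CasAppendLogSketch and its methods are hypothetical, and the fixed capacity is a simplifying assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration of lock-free appends using compare-and-set.
class CasAppendLogSketch<T> {

    private final AtomicReferenceArray<T> slots;
    private final AtomicInteger next = new AtomicInteger(0); // next free index

    CasAppendLogSketch(int capacity) {
        slots = new AtomicReferenceArray<>(capacity);
    }

    // Appends without taking a lock. Returns the index used, or -1 if the log is full.
    int append(T value) {
        while (true) {
            int i = next.get();
            if (i >= slots.length()) {
                return -1;                    // fixed capacity exhausted
            }
            if (next.compareAndSet(i, i + 1)) {
                slots.set(i, value);          // we own slot i; publish the value
                return i;
            }
            // Another thread claimed index i first; loop and try the next one.
        }
    }

    // Non-blocking read. Returns null if the slot was claimed but not yet published.
    T get(int index) {
        return slots.get(index);
    }

    // Number of indices claimed so far.
    int tail() {
        return next.get();
    }
}
```

Writers only contend on the single counter, and readers never wait. The trade-off in this sketch is that a reader can observe a claimed index whose value is still null for a short time.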

Eugene answered Oct 06 '22 12:10