
How to serialise/deserialise long[] value with get/set on random indices using Chronicle Map?

I am new to chronicle-map. I am trying to model an off heap map using chronicle-map where the key is a primitive short and the value is a primitive long array. The max size of the long array value is known for a given map. However I will have multiple maps of this kind each of which may have a different max size for the long array value. My question relates to the serialisation/deserialisation of the key and value.

From reading the documentation I understand that for the key I can use the value type ShortValue and reuse the instance of the implementation of that interface. Regarding the value I have found the page talking about DataAccess and SizedReader which gives an example for byte[] but I'm unsure how to adapt this to a long[]. One additional requirement I have is that I need to get and set values at arbitrary indices in the long array without paying the cost of a full serialisation/deserialisation of the entire value each time.

So my question is: how can I model the value type when constructing the map and what serialisation/deserialisation code do I need for a long[] array if the max size is known per map and I need to be able to read and write random indices without serialising/deserialising the entire value payload each time? Ideally the long[] would be encoded/decoded directly to/from off heap without undergoing an on heap intermediate conversion to a byte[] and also the chronicle-map code would not allocate at runtime. Thank you.

asked Feb 06 '18 by junkie

1 Answer

Answering extra questions:

I've implemented a SizedReader+Writer. Do I need DataAccess, or is SizedWriter fast enough for primitive arrays? I looked at ByteArrayDataAccess, but it's not clear how to port it to long arrays, given that the internal HeapBytesStore is so specific to byte[]/ByteBuffer.

Using DataAccess instead of SizedWriter saves one copy of the value data on Map.put(key, value). However, if in your use case putOneValue() (as in the example above) is the dominating type of query, it won't make much difference. If Map.put(key, value) (and replace(), etc., i.e. any "full value write" operations) are important, it is still possible to implement DataAccess for LongList. It will look like this:

import net.openhft.chronicle.bytes.BytesStore;
import net.openhft.chronicle.bytes.RandomDataOutput;
import net.openhft.chronicle.hash.Data;
import net.openhft.chronicle.hash.serialization.DataAccess;
import net.openhft.chronicle.hash.serialization.StatefulCopyable;

class LongListDataAccess implements DataAccess<LongList>, Data<LongList>,
        StatefulCopyable<LongListDataAccess> {
    // Lazily populated copy of the serialized form, only needed if bytes() is called.
    transient BytesStore cachedBytes;
    transient boolean cachedBytesInitialized;
    // The list currently being serialized; set in getData(), valid until uninit().
    transient LongList list;

    @Override public Data<LongList> getData(LongList list) {
        this.list = list;
        this.cachedBytesInitialized = false;
        return this;
    }

    // Serialized size: one 8-byte long per element.
    @Override public long size() {
        return ((long) list.size()) * Long.BYTES;
    }

    // Writes the elements straight to off-heap memory, no intermediate byte[].
    @Override public void writeTo(RandomDataOutput target, long targetOffset) {
        for (int i = 0; i < list.size(); i++) {
            target.writeLong(targetOffset + ((long) i) * Long.BYTES, list.get(i));
        }
    }

    ...
}

For efficiency, the methods size() and writeTo() are key. But it's important to implement all the other methods (not written out here; a rough sketch of them follows below) correctly too. Read the DataAccess, Data and StatefulCopyable Javadocs very carefully, and study the "Understanding StatefulCopyable, DataAccess and SizedReader" and "Custom serialization checklist" sections in the tutorial with equal attention.
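
As a rough illustration only, not verbatim from the answer, the elided methods might look like the sketch below. The lazy copy into an elastic direct buffer, and the assumption that LongList offers clear()/addAll(), are mine; hash() and equivalent() are left to their defaults (check the Javadocs for whether that fits your case). Assumes additional imports of net.openhft.chronicle.bytes.Bytes and net.openhft.chronicle.bytes.RandomDataInput.

// Continuation of LongListDataAccess - a sketch under the stated assumptions.

// Data.bytes(): random-access view of the serialized form, produced lazily
// so that putOneValue()-style queries never pay for it.
@Override public RandomDataInput bytes() {
    if (!cachedBytesInitialized) {
        if (cachedBytes == null) {
            cachedBytes = Bytes.allocateElasticDirect(size()); // assumed strategy
        }
        writeTo(cachedBytes, 0);
        cachedBytesInitialized = true;
    }
    return cachedBytes;
}

@Override public long offset() {
    return 0; // cachedBytes holds just this one value, starting at offset 0
}

@Override public LongList get() {
    return list;
}

@Override public LongList getUsing(LongList using) {
    // Assumes LongList has clear()/addAll(); null handling omitted for brevity.
    using.clear();
    using.addAll(list);
    return using;
}

// DataAccess: clear per-query state so the instance can be reused.
@Override public void uninit() {
    list = null;
}

// StatefulCopyable: each context/thread gets its own stateful instance.
@Override public LongListDataAccess copy() {
    return new LongListDataAccess();
}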


Does the read/write locking mediate across multiple processes reading and writing on the same machine, or just within a single process?

It's safe across processes; note that the interface is called InterProcessReadWriteUpdateLock.
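
To make that concrete, here is a minimal sketch of taking that lock through the query-context API and writing one element in place. The putOneValue name and the ShortValue/LongList types are carried over from the discussion above; the BytesStore cast is an assumption about how the entry's bytes are backed:

import net.openhft.chronicle.bytes.BytesStore;
import net.openhft.chronicle.core.values.ShortValue;
import net.openhft.chronicle.hash.Data;
import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.map.ExternalMapQueryContext;
import net.openhft.chronicle.map.MapEntry;

// Sketch: set one element at `index`, under the inter-process write lock.
static void putOneValue(ChronicleMap<ShortValue, LongList> map,
                        ShortValue key, int index, long element) {
    try (ExternalMapQueryContext<ShortValue, LongList, ?> c = map.queryContext(key)) {
        c.writeLock().lock(); // part of InterProcessReadWriteUpdateLock
        MapEntry<ShortValue, LongList> entry = c.entry();
        if (entry != null) {
            Data<LongList> value = entry.value();
            // Write the single long directly into the entry's off-heap bytes,
            // without deserializing the rest of the array.
            BytesStore bytes = (BytesStore) value.bytes();
            bytes.writeLong(value.offset() + ((long) index) * Long.BYTES, element);
        }
    }
}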


When storing objects with a variable size not known in advance as values, will that cause fragmentation off heap and in the persisted file?

Storing a value for a key once, and never changing the value's size (and never removing keys) after that, won't cause external fragmentation. Changing the size of a value, or removing keys, could cause external fragmentation. The ChronicleMapBuilder.actualChunkSize() configuration lets you trade external fragmentation against internal fragmentation: the bigger the chunk, the less external fragmentation but the more internal fragmentation. If your values are significantly bigger than the page size (4 KB), you could set an absurdly big chunk size and still have internal fragmentation bounded by the page size, because Chronicle Map is able to exploit the lazy page allocation feature in Linux.
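
For illustration, a builder configuration along these lines might look as follows. The entry count, sizes, names, and file path are assumptions, as is the LongListSizedReader class (the reader the asker said they implemented); valueReaderAndDataAccess() wires the reader together with the LongListDataAccess from above:

import java.io.File;
import java.io.IOException;
import net.openhft.chronicle.core.values.ShortValue;
import net.openhft.chronicle.map.ChronicleMap;

// Sketch: building a persisted map with an explicit chunk size.
static ChronicleMap<ShortValue, LongList> createMap(int maxArrayLen) throws IOException {
    return ChronicleMap
            .of(ShortValue.class, LongList.class)
            .name("long-arrays")                            // assumed name
            .entries(1_000)                                 // assumed entry count
            .averageValueSize((double) maxArrayLen * Long.BYTES)
            .valueReaderAndDataAccess(new LongListSizedReader(), // assumed reader impl
                                      new LongListDataAccess())
            .actualChunkSize(4 << 10)   // 4 KB: internal fragmentation ~ page size
            .createPersistedTo(new File("long-arrays.dat"));     // assumed path
}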

answered Nov 10 '22 by leventov