
Astyanax's EntityPersister & Collection Updates

Background

Astyanax's Entity Persister saves an Entity's Map field across multiple columns, one column per entry. Each column is named mapVariable.key.
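To make the column layout concrete, here is a minimal sketch in plain Java that flattens a map into column names of the form mapVariable.key. It does not call Astyanax; the class name MapColumnNames, the method toColumns, and the field name "attributes" are hypothetical and only illustrate the naming scheme described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MapColumnNames {
    // Flattens one map field into column-name -> value pairs,
    // naming each column "<fieldName>.<key>".
    static Map<String, String> toColumns(String fieldName, Map<String, String> map) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : map.entrySet()) {
            columns.put(fieldName + "." + entry.getKey(), entry.getValue());
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("testkey", "testvalue");
        attributes.put("testkey2", "testvalue2");
        // Each entry becomes its own column under the "attributes." prefix.
        System.out.println(toColumns("attributes", attributes));
    }
}
```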

The Problem:

Astyanax's Entity Persister doesn't remove deleted key/value pairs from Cassandra when a Map in an Entity has been updated.

The Solution I'm Using Now (bad approach)

I delete the whole row and then reinsert it.

Some More Info

I persist my Java objects in Cassandra using Astyanax's Entity Persister (com.netflix.astyanax.entitystore).

What I've noticed is that when an Entity's Map is persisted with, say, 2 values: testkey:testvalue & testkey2:testvalue2, and the next time the same Entity's Map is persisted with one value (one key/value pair was removed): testkey:testvalue, the testkey2:testvalue2 isn't deleted from the column family.

So, as a work-around, I need to delete the whole row and then reinsert it.

My insertion code:

    final EntityManager<T, String> entityManager = new DefaultEntityManager.Builder<T, String>()
        .withEntityType(clazz)
        .withKeyspace(getKeyspace())
        .withColumnFamily(columnFamily)
        .build();
    entityManager.put(entity);

What am I missing? This is really inefficient, and I think Astyanax's Entity Persister is supposed to take care of this on its own.

Any thoughts?

Vladimir asked May 21 '13 15:05


2 Answers

You are not missing anything.

What happens is the following:

1. Astyanax creates a list of ColumnMappers, one for each field of the entity being serialized.
2. The ColumnMappers then take turns populating the mutation batch.
3. For maps, MapColumnMapper is used. If you take a look at its code, you will see that it just adds key:value pairs to the mutation batch.
4. When the data is put in a row in Cassandra, new columns from the batch are added and existing ones are overwritten, but old ones unfortunately remain untouched.
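The effect of step 4 can be modeled without a cluster. The sketch below is plain Java, not Astyanax code: the row is a HashMap, upsert mimics "add or overwrite, never delete", and staleColumns lists the columns under a map field's prefix that the new batch no longer contains. All names (StaleColumnDemo, upsert, staleColumns, the "attributes" field) are hypothetical. Deleting just those stale columns explicitly would, in principle, be cheaper than deleting the whole row, but how to issue such deletes alongside the Entity Persister is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class StaleColumnDemo {
    // Models a row write: columns in the batch are added or overwritten,
    // and every other column in the row is left as it was.
    static void upsert(Map<String, String> row, Map<String, String> batch) {
        row.putAll(batch);
    }

    // Returns the stored columns under "<fieldName>." that the new batch
    // does not mention, i.e. the entries removed from the map.
    static Set<String> staleColumns(Map<String, String> row,
                                    Map<String, String> batch,
                                    String fieldName) {
        String prefix = fieldName + ".";
        Set<String> stale = new TreeSet<>();
        for (String name : row.keySet()) {
            if (name.startsWith(prefix) && !batch.containsKey(name)) {
                stale.add(name);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();

        // First save: two map entries.
        Map<String, String> first = new HashMap<>();
        first.put("attributes.testkey", "testvalue");
        first.put("attributes.testkey2", "testvalue2");
        upsert(row, first);

        // Second save: one entry was removed from the map.
        Map<String, String> second = new HashMap<>();
        second.put("attributes.testkey", "testvalue");
        Set<String> stale = staleColumns(row, second, "attributes");
        upsert(row, second);

        // The removed entry is still stored in the row after the second save.
        System.out.println("still stored: " + row.containsKey("attributes.testkey2"));
        System.out.println("columns to delete explicitly: " + stale);
    }
}
```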

One solution here would be to write a custom serializer for your map and save it in one field.
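As a rough illustration of the single-field idea, the sketch below encodes a whole map into one string and decodes it back, so that saving the entity overwrites one column and removed entries disappear with it. It is plain Java under stated assumptions: the class SingleColumnMapCodec and its methods are hypothetical, keys and values are assumed to be non-null strings, and the step of registering such a codec as a serializer with Astyanax is deliberately left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class SingleColumnMapCodec {
    // Encodes the map as "k1=v1&k2=v2", URL-escaping keys and values so
    // that '&' and '=' inside the data cannot break the format.
    static String encode(Map<String, String> map) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : map.entrySet()) {
            if (out.length() > 0) {
                out.append('&');
            }
            out.append(escape(entry.getKey())).append('=').append(escape(entry.getValue()));
        }
        return out.toString();
    }

    // Reverses encode(); an empty string decodes to an empty map.
    static Map<String, String> decode(String encoded) {
        Map<String, String> map = new LinkedHashMap<>();
        if (encoded.isEmpty()) {
            return map;
        }
        for (String pair : encoded.split("&")) {
            String[] keyValue = pair.split("=", 2);
            map.put(unescape(keyValue[0]), unescape(keyValue[1]));
        }
        return map;
    }

    private static String escape(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    private static String unescape(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("testkey", "testvalue");
        String stored = encode(map);
        System.out.println(stored + " -> " + decode(stored));
    }
}
```

The trade-off is that individual map entries can no longer be read or updated as separate columns; every change rewrites the whole value.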

ursus answered Oct 05 '22 20:10


Take a look at https://github.com/deanhiller/playorm. It's an ORM mapper for Cassandra, HBase, and a few other NoSQL databases. It removes items from collections on save. It's also way easier to use than Astyanax.

Radim Kolář answered Oct 05 '22 19:10