
JUnit Testing Cassandra with embedded server

What is the best approach to writing unit tests for code that persists data to a NoSQL data store, in our case Cassandra?

=> We are using the embedded server approach, with a utility from GitHub (https://github.com/hector-client/hector/blob/master/test/src/main/java/me/prettyprint/hector/testutils/EmbeddedServerHelper.java). However, I have been seeing some issues with it:

1) It persists data across test cases, which makes it hard to ensure that each test case in a test class starts with its own data. I tried calling cleanup in an @After method after each test case, but that doesn't seem to clean up the data.
2) We are running out of memory as we add more tests. This could be caused by 1), but I am not sure yet. I currently run my build with a 1G heap.

=> The other approach I have been considering is to mock the Cassandra storage. But that might let schema problems slip through, since the embedded server approach has often caught issues with the way our data is stored in Cassandra.

Please let me know your thoughts on this, and whether anyone has used EmbeddedServerHelper and is familiar with the issues I have mentioned.


Just an update: I was able to resolve issue 2), running out of Java heap space during builds, by changing the in_memory_compaction_limit_in_mb parameter from 64 to 32 in the cassandra.yaml used by the embedded test server. With the value at 64, the build started to fail consistently during compaction. This link helped me: http://www.datastax.com/docs/0.7/configuration/storage_configuration#in-memory-compaction-limit-in-mb

bobbypavan asked Jul 07 '11 14:07



1 Answer

We use an embedded Cassandra server, and I think that is the best approach when testing Cassandra. Mocking the Cassandra API is too error-prone.

EmbeddedServerHelper.cleanup() just removes files from the file system, but data may still exist in memory.

There is a teardown() method in EmbeddedServerHelper, but I am not sure how effective it is, as Cassandra has a lot of static singletons whose state is not cleaned up by teardown().

What we do instead is have a method that calls truncate on each column family between tests. That removes all data.
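A minimal sketch of that idea, not our actual code: the class name ColumnFamilyCleaner, the keyspace name, and the column family names are all hypothetical. The truncate call itself is passed in as a callback, so the sketch does not assume any particular client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/**
 * Hypothetical helper that truncates every listed column family.
 * The real truncate operation is injected as a callback taking
 * (keyspace, columnFamily), so this class has no client dependency.
 */
class ColumnFamilyCleaner {
    private final String keyspace;
    private final List<String> columnFamilies;
    private final BiConsumer<String, String> truncate;

    ColumnFamilyCleaner(String keyspace,
                        List<String> columnFamilies,
                        BiConsumer<String, String> truncate) {
        this.keyspace = keyspace;
        // Defensive copy so later changes to the caller's list have no effect.
        this.columnFamilies = new ArrayList<>(columnFamilies);
        this.truncate = truncate;
    }

    /** Intended to be called from an @After method so each test starts empty. */
    void truncateAll() {
        for (String columnFamily : columnFamilies) {
            truncate.accept(keyspace, columnFamily);
        }
    }
}
```

In a test class you would build the cleaner once and call truncateAll() from the @After method. The callback would wrap whatever truncate operation your client exposes; with Hector this is presumably a truncate call on the cluster object, but check that against the version you use.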

sbridges answered Sep 29 '22 17:09
