 

Coherence Topology Suggestion

Data to be cached:

  • 100 GB of data
  • Objects of 500-5000 bytes
  • 1000 objects updated/inserted per minute on average (peak 5000)

Need suggestions for a Coherence topology in production and test (distributed scheme with backup):

  • number of servers
  • nodes per server
  • heap size per node

Questions

  • How much free memory is needed per node relative to the memory used by cached data (assuming 100% heap usage is not possible)?
  • How much overhead will 1-2 additional indexes per cache entry generate?

We do not know how many read operations will be done; the solution will be used by clients for whom low response times are more critical than data consistency, and the read load depends on each use case. The cache will be updated from the DB by polling at a fixed frequency and repopulating the cache (the cache is the data master for the consuming systems, not the systems using the cache).
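A minimal sketch of that fixed-frequency polling loader, assuming a Coherence NamedCache; the cache name "objects" and the ChangeSource DAO hook are illustrative, not from the question:

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Sketch of a fixed-frequency poller that pushes changed rows from the
 * database into a Coherence cache. Cache name and DAO hook are illustrative.
 */
public class CachePoller {

    /** Hypothetical data-access hook; replace with real JDBC/JPA code. */
    public interface ChangeSource {
        Map<String, byte[]> loadChangedSince(long sinceMillis);
    }

    private final NamedCache cache = CacheFactory.getCache("objects");   // illustrative cache name
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final ChangeSource source;
    private volatile long lastPollMillis;

    public CachePoller(ChangeSource source) {
        this.source = source;
    }

    public void start(long periodSeconds) {
        // Poll at a fixed frequency; tune the period to the freshness the clients need.
        scheduler.scheduleAtFixedRate(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
    }

    private void pollOnce() {
        long now = System.currentTimeMillis();
        Map<String, byte[]> changed = source.loadChangedSince(lastPollMillis);
        if (!changed.isEmpty()) {
            // putAll batches the entries per partition owner, which is much cheaper
            // than individual puts at 1000-5000 updates per minute.
            cache.putAll(changed);
        }
        lastPollMillis = now;
    }
}
```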

Asked May 10 '11 by Audun


2 Answers

The rule of thumb for sizing a Coherence JVM is that the data occupies 1/3 of the heap, assuming 1 backup: 1/3 for primary cache data, 1/3 for backup copies, and 1/3 for indexes and overhead.

The biggest difficulty in sizing is that there are no good ways to estimate index sizes. You have to try with real-world data and measure.

A rule of thumb for JDK 1.6 JVMs is to start with 4GB heaps, so you would need 75 cache server nodes. Some people have been successful with much larger heaps (16GB), so it is worth experimenting. With large heaps (e.g., 16GB) you should not need as much as 1/3 for overhead and can hold more than 1/3 for data. With heaps greater than 16GB, garbage collector tuning becomes critical.
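A back-of-the-envelope version of that sizing as a plain Java sketch; the 1/3 usable-heap fraction is the rule of thumb above, and the 40% figure for large heaps is an illustrative assumption to be verified by measurement:

```java
/** Rough node-count estimate from the 1/3-of-heap rule of thumb above. */
public final class CoherenceSizing {

    public static int cacheServerNodesNeeded(double dataGb, double heapPerNodeGb, double usableFraction) {
        double dataPerNodeGb = heapPerNodeGb * usableFraction;   // e.g. 4 GB * 1/3 ≈ 1.33 GB of primary data per node
        return (int) Math.ceil(dataGb / dataPerNodeGb);
    }

    public static void main(String[] args) {
        // 100 GB of data, 4 GB heaps, 1/3 of heap usable for primary data -> 75 nodes
        System.out.println(cacheServerNodesNeeded(100, 4, 1.0 / 3.0));
        // Larger 16 GB heaps with, say, 40% usable (illustrative, not a guarantee) -> 16 nodes
        System.out.println(cacheServerNodesNeeded(100, 16, 0.40));
    }
}
```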

For maximum performance, you should have 1 core per node.

The number of server machines depends on practical limits of manageability, capacity (cores and memory), and failure. For example, even if you have a server that can handle 32 nodes, what happens to your cluster when a machine fails? The cluster will be machine-safe (backups are not on the same machine), but recovery would be very slow given the massive amount of data to be moved to new backups. On the other hand, 75 machines are hard to manage.

I've seen Coherence achieve latencies of 250 microseconds (not milliseconds) for a 1K object put, including network hops and backup. So the insert and update rates you are looking at should be achievable. Test with multiple threads inserting/updating and make sure your test client is not the bottleneck.
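A minimal multi-threaded put test along those lines; the cache name "objects" is illustrative, and the payload size matches the ~1 KB objects discussed above. The point is to drive puts from several threads so the client does not become the bottleneck:

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of a multi-threaded insert/update test against a Coherence cache. */
public class PutLoadTest {

    public static void main(String[] args) throws InterruptedException {
        final NamedCache cache = CacheFactory.getCache("objects");   // illustrative cache name
        final int threads = 8;
        final int putsPerThread = 10000;

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();

        for (int t = 0; t < threads; t++) {
            final int threadId = t;
            pool.submit(() -> {
                Random rnd = new Random(threadId);
                byte[] value = new byte[1024];                        // ~1 KB payload, per the object sizes above
                rnd.nextBytes(value);
                for (int i = 0; i < putsPerThread; i++) {
                    cache.put(threadId + "-" + i, value);             // synchronous put: waits for primary + backup
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long totalPuts = (long) threads * putsPerThread;
        System.out.printf("%d puts in %d ms (%.0f puts/s)%n", totalPuts, elapsedMs, totalPuts * 1000.0 / elapsedMs);
    }
}
```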

Answered by David G


A few more "rules of thumb":

1) For high availability, three nodes is a good minimum.

2) With Java 7, you can use larger heaps (e.g. 27GB) and the G1 garbage collector.

3) For 100GB of data, using David's guidelines, you will want 300GB total of heap. On servers with 128GB of memory, that could be done with 3 physical servers, each running 4 JVMs with 27GB heap each (~324GB total).

4) Index memory usage varies significantly with data type and arity. It is best to test with a representative data set, both with and without indexes, to see what the memory usage difference is.
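One rough way to run that comparison, as a single-node sketch: load a representative data set into a storage-enabled member, snapshot heap usage, add the indexes, and snapshot again. The getCustomerId/getRegion extractors are hypothetical; a production measurement would more reliably use heap dumps or the Coherence JMX metrics rather than Runtime arithmetic.

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.extractor.ReflectionExtractor;

/** Rough sketch: compare heap use on a storage-enabled node before and after adding indexes. */
public class IndexOverheadProbe {

    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("objects");          // illustrative cache name

        // ... load a representative data set here (omitted) ...

        long before = usedHeap();

        // Hypothetical accessors on the cached value class; indexes are built on the storage
        // nodes, so this measurement only makes sense on a storage-enabled member.
        cache.addIndex(new ReflectionExtractor("getCustomerId"), /*ordered*/ false, /*comparator*/ null);
        cache.addIndex(new ReflectionExtractor("getRegion"),     /*ordered*/ true,  /*comparator*/ null);

        long after = usedHeap();
        System.out.printf("Index overhead (rough): %d MB%n", (after - before) / (1024 * 1024));
    }

    private static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();                                                  // crude; a heap dump or JMX is more reliable
        return rt.totalMemory() - rt.freeMemory();
    }
}
```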

Answered by cpurdy