 

Does couchbase actually support datasets larger than memory?

Tags:

couchbase

Couchbase documentation says that "Disk persistence enables you to perform backup and restore operations, and enables you to grow your datasets larger than the built-in caching layer," but I can't seem to get it to work.

I am testing Couchbase 2.5.1 on a three node cluster, with a total of 56.4GB memory configured for the bucket. After ~124,000,000 100-byte objects -- about 12GB of raw data -- it stops accepting additional puts. 1 replica is configured.

Is there a magic "go ahead and spill to disk" switch that I'm missing? There are no suspicious entries in the errors log.

asked May 30 '14 at 16:05 by user3691683

2 Answers

It does support data sets larger than memory - see Ejection and working set management in the manual.

In your case, what errors is your application getting? When memory usage reaches the high water mark, items are ejected from memory (until usage falls back to the low water mark) to make room for newer items.

Depending on disk speed and the rate of incoming items, this can result in TEMP_OOM errors being sent back to the client, telling it to temporarily back off before retrying the set; these should generally be rare in most instances. Details on handling them can be found in the Developer Guide.
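For illustration, here is a minimal retry-with-backoff sketch for handling those temporary OOM responses. It assumes the Couchbase Python SDK 2.x API (`Bucket`, `upsert`, `TemporaryFailError`); the exact class and exception names may differ in other SDK versions, and the connection string and key are placeholders.

```python
import time

from couchbase.bucket import Bucket
from couchbase.exceptions import TemporaryFailError  # raised on TEMP_OOM / ETMPFAIL

# Placeholder connection string and bucket name for this sketch.
bucket = Bucket('couchbase://127.0.0.1/default')


def upsert_with_backoff(key, doc, max_retries=5, base_delay=0.05):
    """Store a document, backing off exponentially on temporary OOM errors."""
    for attempt in range(max_retries):
        try:
            return bucket.upsert(key, doc)
        except TemporaryFailError:
            # The server is busy ejecting items to disk and cannot accept the
            # write yet; wait a little longer on each attempt before retrying.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError('Giving up on %s after %d temporary OOM errors' % (key, max_retries))


upsert_with_backoff('user::1234', {'name': 'example', 'visits': 1})
```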

answered Oct 05 '22 at 06:10 by DaveR


My guess would be that it's not the raw data that is filling up your memory, but the metadata associated with it. Couchbase 2.5 needs 56 bytes of metadata per key, so in your case that would be approximately 7GB of metadata - much less than your memory quota.
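As a quick sanity check on that estimate, the arithmetic (assuming the 56-bytes-per-key figure quoted above, and ignoring key length and the replica copy) looks like this:

```python
# Rough metadata estimate for the scenario in the question.
# Assumption: 56 bytes of metadata per item, as stated for Couchbase 2.5;
# key length and the replica copy are ignored here.
items = 124_000_000
meta_bytes_per_item = 56

meta_gb = items * meta_bytes_per_item / 1e9
print(round(meta_gb, 1))  # -> 6.9, i.e. roughly 7GB of metadata
```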

But... metadata can become fragmented in memory. If you batch-inserted all 124M objects in a very short time, I would expect at least 90% fragmentation. That means that although only ~7GB of metadata is actually useful, the space allocated to hold it has filled up your RAM, with large unused gaps in each allocated block.

The solution to your problem is to defragment (compact) the data. Compaction can either be triggered manually or configured to run automatically (a scripted way to trigger it is sketched after this list):

  • manually: via the bucket's Compact action in the admin console
  • automatically: via the cluster's auto-compaction settings
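If you prefer to script it, compaction can also be triggered over the REST API. This is a hedged sketch: the endpoint shown (`/pools/default/buckets/<bucket>/controller/compactBucket`) is the bucket-compaction endpoint documented for Couchbase 2.x, but verify the path and credentials for your server version; host, auth, and bucket name below are placeholders.

```python
import requests

# Placeholders: adjust host, credentials, and bucket name for your cluster.
HOST = 'http://127.0.0.1:8091'
AUTH = ('Administrator', 'password')
BUCKET = 'default'

# Trigger a manual compaction of the bucket's data and index files.
resp = requests.post(
    '%s/pools/default/buckets/%s/controller/compactBucket' % (HOST, BUCKET),
    auth=AUTH,
)
resp.raise_for_status()
print('Compaction started for bucket %s' % BUCKET)
```

The `couchbase-cli bucket-compact` command and the auto-compaction settings in the admin console expose the same functionality without scripting.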

If you need more insight into why compaction is needed, you can read this blog article from Couchbase.

answered Oct 05 '22 at 05:10 by Mickaël Le Baillif