
Bytes currently in use by a Google Guava cache?

I'd like to find out how many bytes are being used in the cache. This is useful for determining a reasonable cache size. What are some good ways to tally the number of bytes used in a Google Guava cache?

The stats method doesn't give what I want; it does not include metrics on the number of bytes in the cache.

The asMap method is the best way I've found so far. After getting the contents this way, one could use some of the techniques shown in "In Java, what is the best way to determine the size of an object?". But, frankly, these seem fairly painful, at least from a Clojure codebase. To avoid some dependencies, I'm currently using a rough shortcut with Nippy, a Clojure serialization library: (count (nippy/freeze (.asMap cache))). I'm looking for better ways.
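For readers on the Java side, the Nippy shortcut above can be approximated with plain JDK serialization. This is only a rough sketch under stated assumptions: the class and method names (SerializedSizeEstimate, serializedBytes) are made up for illustration, every key and value is assumed to implement Serializable, and the result is the size of the serialized form, which is not the same as the heap footprint.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

final class SerializedSizeEstimate {
    // Hypothetical helper, not part of Guava: serializes a snapshot of the
    // given map and returns the number of bytes written.
    // With Guava, the argument would be the view returned by cache.asMap().
    static long serializedBytes(Map<?, ?> view) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Copy into a HashMap so the result does not depend on whether
            // the view's own class is serializable.
            out.writeObject(new HashMap<>(view));
        } catch (IOException e) {
            // Also reached when a key or value is not Serializable.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Map<String, byte[]> contents = new HashMap<>();
        contents.put("a", new byte[1024]);
        System.out.println(serializedBytes(contents) + " serialized bytes");
    }
}
```

The number is best read as a relative indicator (is the cache growing, and roughly by how much) rather than as an exact count of heap bytes.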

I am using Google Guava caching from a Clojure codebase, but my question is not necessarily Clojure specific; Java interop is relatively easy in most cases.

Update: Some context in response to a comment below. First, I'd like to know if I'm overlooking a useful part of the Google Guava caching API. Second, I don't know if the generic approaches I linked (for counting memory usage on the heap) apply well to Guava. More broadly, finding cache size utilization is an important use case, so I'm a little surprised it isn't better documented online.

David J. asked Sep 05 '14 18:09




1 Answer

Java (and by extension Guava) does not provide any easy or meaningful way to measure "bytes used" by an object or data structure. Notably there isn't even a single coherent definition of that concept, since an object can be referenced from multiple other objects and there's no notion of bytes being "owned" by a particular data structure. Other languages like Rust have this notion of ownership, but Java does not.

For example, how many bytes does an instance of MyClass use?

public class MyClass {
  private static final int[] BIG_ARRAY = new int[1_000_000];
  private final int[] myArray = BIG_ARRAY;
}

Clearly the class uses a lot of memory (the array alone holds a million 4-byte ints, roughly 4 MB), but each instance only uses a few bytes to reference the statically allocated array. You can create thousands of MyClass instances and see very little memory impact, and even if all instances are garbage-collected, BIG_ARRAY will stick around. So it seems wrong to say that an instance of MyClass "uses" the backing array's bytes.
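The sharing described above can be checked directly. The following is a small sketch, not a measurement tool: the wrapper class SharedArrayDemo, the array() accessor, and the distinctArrays helper are additions made for the demonstration and do not appear in the original example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class SharedArrayDemo {
    static final class MyClass {
        private static final int[] BIG_ARRAY = new int[1_000_000];
        private final int[] myArray = BIG_ARRAY;

        // Accessor added for the demo only.
        int[] array() {
            return myArray;
        }
    }

    // Counts how many distinct array objects (by identity) the instances reference.
    static int distinctArrays(List<MyClass> instances) {
        Set<int[]> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (MyClass instance : instances) {
            seen.add(instance.array());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<MyClass> instances = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            instances.add(new MyClass());
        }
        // Every instance points at the same statically allocated array.
        System.out.println("distinct arrays: " + distinctArrays(instances));
    }
}
```

Any tool that sums the "deep size" of each instance would count the shared array once per instance, which is why the per-object number is misleading here.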

You can determine how many bytes the cache itself uses (e.g. to compare it to using a ConcurrentHashMap or another collection) based on the fields and instances a Cache maintains. Guava links to this helpful resource of data structure memory footprints you can reference, but obviously this won't include the contents of the cache, just its structure.

As Louis Wasserman commented, you should look for a different metric that tells you more directly what you need to know. For instance, you might be more interested in the hit rate, which tells you how efficiently you're using whatever data your cache is retaining.
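To make the hit-rate metric concrete, here is a minimal JDK-only sketch of how it is computed: hits divided by total lookups. CountingCache is a hypothetical stand-in written for this illustration, not a Guava class. With Guava itself, building the cache with CacheBuilder.recordStats() and reading cache.stats().hitRate() should give the equivalent figure without any hand-rolled counters.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

// Hypothetical stand-in for a loading cache; it never evicts anything.
final class CountingCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();
    private final Function<K, V> loader;

    CountingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, loading it on a miss, and records the outcome.
    V get(K key) {
        V value = store.get(key);
        if (value != null) {
            hits.increment();
            return value;
        }
        misses.increment();
        return store.computeIfAbsent(key, loader);
    }

    // Fraction of lookups served from the cache; 1.0 when nothing was requested yet.
    double hitRate() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 1.0 : (double) h / total;
    }
}
```

A low hit rate at a given size limit suggests the cache is too small (or the access pattern is not cache-friendly), which answers the sizing question more directly than a byte count would.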

dimo414 answered Sep 24 '22 12:09