
Does the RemovalListener callback in the Guava caching API make sure that no one is using the object?

I read the code samples and documentation about caching on the wiki page. I see that the RemovalListener callback can be used to tear down evicted cached objects. My question is: does the library make sure that the object is not being used by any other thread before calling the provided RemovalListener? Let's consider the code example from the docs:

CacheLoader<Key, DatabaseConnection> loader = 
                                 new CacheLoader<Key, DatabaseConnection> () {
  public DatabaseConnection load(Key key) throws Exception {
    return openConnection(key);
  }
};
RemovalListener<Key, DatabaseConnection> removalListener =
                          new RemovalListener<Key, DatabaseConnection>() {
  public void onRemoval(RemovalNotification<Key, DatabaseConnection> removal) {
    DatabaseConnection conn = removal.getValue();
    conn.close(); // tear down properly
  }
};

return CacheBuilder.newBuilder()
  .expireAfterWrite(2, TimeUnit.MINUTES)
  .removalListener(removalListener)
  .build(loader);

Here the cache is configured to evict elements 2 minutes after creation. (I understand that it may not be exactly two minutes, because eviction is piggybacked on user read/write calls.) But whatever the actual time is, will the library check that no active reference to the object exists before passing it to the RemovalListener? Another thread may have fetched the object from the cache long ago and may still be using it. In that case I cannot call close() on it from the RemovalListener.

Also, the documentation of RemovalNotification says: "A notification of the removal of a single entry. The key and/or value may be null if they were already garbage collected." So conn could be null in the above example. How do we tear down the conn object properly in that case? As written, the code example would throw a NullPointerException.

The use case I am trying to address is:

  1. Each cache element needs to expire two minutes after creation.
  2. The evicted object needs to be closed, but only after making sure no one is still using it.
asked Jul 19 '12 15:07 by Arup Malakar




1 Answer

Guava contributor here.

My question is does the library make sure that the object is not being used by any other thread before calling the provided RemovalListener.

No. That would be impossible for Guava to do in general, and a bad idea anyway! Suppose the cache values were Integers: because Integer.valueOf reuses Integer objects for values from -128 to 127, those objects are always referenced somewhere else, so you could never expire an entry whose value falls in that range. That would be bad.
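The Integer point is easy to check directly. The following is a minimal sketch (the class name IntegerCacheDemo is made up for illustration) that relies only on the documented behaviour of Integer.valueOf, which always caches values from -128 to 127 and may or may not cache others:

```java
class IntegerCacheDemo {
    /** True when two separate valueOf calls hand back the very same object. */
    static boolean sameInstance(int n) {
        Integer a = Integer.valueOf(n);
        Integer b = Integer.valueOf(n);
        return a == b; // reference comparison on purpose, not equals()
    }

    public static void main(String[] args) {
        // Values in -128..127 are always served from a shared cache.
        System.out.println(sameInstance(127));
        // Larger values are usually fresh objects, but that is JVM-dependent.
        System.out.println(sameInstance(1000));
    }
}
```

Since a shared Integer like this is reachable from the JDK's own cache, a "wait until nobody else references it" rule would keep such an entry alive forever.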

Also the documentation of RemovalNotification says that: A notification of the removal of a single entry. The key and/or value may be null if they were already garbage collected. So according to it conn could be null in the above example.

To be clear, that's only possible if you're using weakKeys, weakValues, or softValues. (And, as you've correctly deduced, you can't really use any of those if you need to do some teardown on the value.) If you're only using some other form of expiration, you'll never get a null key or value.

In general, I don't think a GC-based solution is going to work here. You must have a strong reference to the connection to close it properly. (Overriding finalize() might work here, but that's really a broken thing generally.)

Instead, my approach would be to cache references to a wrapper of some sort. Something like

 class ConnectionWrapper {
   private Connection connection;
   int users = 0;                    // package-private: read by the RemovalListener
   boolean expiredFromCache = false; // package-private: set by the RemovalListener
   public Connection acquire() { users++; return connection; }
   public void release() {
     users--;
     if (users == 0 && expiredFromCache) {
       // The cache expired this connection while it was in use,
       // and we were the last user, so the teardown falls to us.
       tearDown();
     }
   }
   synchronized void tearDown() {
     connection.tearDown();
     connection = null; // disable myself
   }
 }

and then use a Cache<Key, ConnectionWrapper> with a RemovalListener that looks like...

 new RemovalListener<Key, ConnectionWrapper>() {
   public void onRemoval(RemovalNotification<Key, ConnectionWrapper> notification) {
     ConnectionWrapper wrapper = notification.getValue();
     if (wrapper.users == 0) {
       // do the teardown ourselves; nobody's using it
       wrapper.tearDown();
     } else {
       // it's still in use; mark it as expired from the cache
       wrapper.expiredFromCache = true;
     }
   }
}

...and then force users to use acquire() and release() appropriately.
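One way to make the acquire()/release() pairing hard to get wrong is to route every use through a helper that releases in a finally block. This is only a sketch: Lease, withLease and CountingLease are hypothetical names invented here, with Lease standing in for the wrapper's acquire()/release() pair.

```java
import java.util.function.Function;

class LeaseDemo {
    /** Hypothetical stand-in for the wrapper's acquire()/release() pair. */
    interface Lease<T> {
        T acquire();
        void release();
    }

    /** Runs work on the leased resource and always releases it afterwards. */
    static <T, R> R withLease(Lease<T> lease, Function<T, R> work) {
        T resource = lease.acquire();
        try {
            return work.apply(resource);
        } finally {
            lease.release(); // runs even when work throws
        }
    }

    /** Toy lease that only counts outstanding users. */
    static final class CountingLease implements Lease<String> {
        int users = 0;
        public String acquire() { users++; return "connection"; }
        public void release() { users--; }
    }

    public static void main(String[] args) {
        CountingLease lease = new CountingLease();
        Integer length = withLease(lease, s -> s.length());
        System.out.println(length + " chars, users now " + lease.users);
    }
}
```

Callers then never see a bare acquire(), so a forgotten release() cannot leave the user count stuck above zero.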

There's really not going to be any way better than this approach, I think. The only way to detect that there are no other references to the connection is to use GC and weak references, but you can't tear down a connection without a strong reference to it -- which destroys the whole point. You can't guarantee whether it's the RemovalListener or the connection user who'll need to tear down the connection, because what if the user takes more than two minutes to do its thing? I think this is probably the only feasible approach left.

(Warning: the above code assumes only one thread will be doing things at a time; it's not synchronized at all, but hopefully if you need it, then this is enough to give you an idea of how it should work.)
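For the multi-threaded case, one possible variant guards every state change with the wrapper's own lock. This is a sketch under assumptions, not a tested production design: the Connection interface is a hypothetical stand-in, onExpired() is an invented method the RemovalListener would call instead of touching fields, and rejecting acquire() after expiry is an added design choice that the answer does not specify.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedConnectionDemo {
    /** Hypothetical resource; stands in for the answer's Connection. */
    interface Connection { void tearDown(); }

    /** Reference-counted wrapper; all state changes happen under one lock. */
    static final class ConnectionWrapper {
        private final Connection connection;
        private int users = 0;
        private boolean expiredFromCache = false;
        private boolean tornDown = false;

        ConnectionWrapper(Connection connection) { this.connection = connection; }

        /** Returns the connection, or throws once the cache has expired it (assumed policy). */
        synchronized Connection acquire() {
            if (expiredFromCache) throw new IllegalStateException("expired");
            users++;
            return connection;
        }

        synchronized void release() {
            users--;
            if (users == 0 && expiredFromCache) tearDownOnce();
        }

        /** Meant to be called from the RemovalListener. */
        synchronized void onExpired() {
            expiredFromCache = true;
            if (users == 0) tearDownOnce();
        }

        private void tearDownOnce() { // caller must hold the lock
            if (!tornDown) { tornDown = true; connection.tearDown(); }
        }
    }

    public static void main(String[] args) {
        AtomicInteger closed = new AtomicInteger();
        ConnectionWrapper w = new ConnectionWrapper(closed::incrementAndGet);
        w.acquire();
        w.onExpired();  // still in use: teardown is deferred
        w.release();    // last user out performs the teardown
        System.out.println("tearDown calls: " + closed.get());
    }
}
```

Because the user count and the expired flag are read and written under the same lock, exactly one of the listener or the last releasing user ends up performing the teardown. A caller that hits the IllegalStateException would need to fetch a fresh wrapper from the cache.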

answered Sep 21 '22 03:09 by Louis Wasserman
