I am developing a multi-threaded application and using Cassandra for the back-end.
Earlier, I created a separate session for each child thread and closed it before the thread exited. That seemed expensive, so I now open a single session when the server starts, and any number of clients can use that session for queries.
Question: Is this approach correct, or is there a better way to do it? I know connection pooling is an option, but is it really needed in this scenario?
Session instances are thread-safe and usually a single instance is enough per application.
It's certainly thread safe in the Java driver, so I assume the C++ driver is the same.
You are encouraged to only create one session and have all your threads use it so that the driver can efficiently maintain a connection pool to the cluster and process commands from your client threads asynchronously.
If you create multiple sessions on one client machine or keep opening and closing sessions, you would be forcing the driver to keep making and dropping connections to the cluster, which is wasteful of resources.
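To make the "one session, many threads" pattern concrete, here is a minimal sketch. `StubSession` and `run_with_shared_session` are hypothetical names invented for illustration; the stub only counts queries and is not the driver's API. With the real driver you would share the driver's own session object in the same way.

```cpp
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a driver session; NOT the real Cassandra API.
// It only counts executed queries so the sharing pattern can be shown.
class StubSession {
public:
    void execute(const std::string& query) {
        (void)query;  // a real session would send this to the cluster
        executed_.fetch_add(1, std::memory_order_relaxed);
    }
    std::size_t executed() const { return executed_.load(); }

private:
    std::atomic<std::size_t> executed_{0};
};

// Creates one session, lets `workers` threads issue `per_worker` queries each
// through it, and returns the total number of queries the session saw.
std::size_t run_with_shared_session(std::size_t workers, std::size_t per_worker) {
    StubSession session;  // created once, e.g. at server start
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        // Every worker borrows the same session; none opens or closes its own.
        threads.emplace_back([&session, per_worker] {
            for (std::size_t i = 0; i < per_worker; ++i) {
                session.execute("SELECT release_version FROM system.local");
            }
        });
    }
    for (auto& t : threads) t.join();  // all users finish before the session goes away
    return session.executed();
}
```

Calling `run_with_shared_session(4, 1000)` runs four workers against the single shared stub. The point of the sketch is only the ownership shape: the session outlives every thread that uses it.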
Quoting this DataStax blog post about 4 simple rules when using the DataStax drivers for Cassandra:
- Use one Cluster instance per (physical) cluster (per application lifetime)
- Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries
- If you execute a statement more than once, consider using a PreparedStatement
- You can reduce the number of network roundtrips and also have atomic operations by using Batches
The C/C++ driver is definitely thread safe at the session and future levels.
The CassSession object is used for query execution. Internally, a session object also manages a pool of client connections to Cassandra and uses a load balancing policy to distribute requests across those connections. An application should create a single session object per keyspace as a session object is designed to be created once, reused, and shared by multiple threads within the application.
The driver documentation actually has a section called Thread Safety:
A CassSession is designed to be used concurrently from multiple threads. CassFuture is also thread safe. Other than these exclusions, in general, functions that might modify an object’s state are NOT thread safe. Objects that are immutable (marked ‘const’) can be read safely by multiple threads.
The documentation also has a note about freeing objects, which is not thread safe. So you have to make sure all your threads are done with an object before you free it:
NOTE: The object/resource free-ing functions (e.g. cass_cluster_free, cass_session_free, … cass_*_free) cannot be called concurrently on the same instance of an object.
Source:
http://datastax.github.io/cpp-driver/topics/
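The note about freeing boils down to an ordering rule: join every thread that uses the object, then free it once from a single thread. A minimal sketch of that ordering follows. `StubSession`, `stub_session_free`, and `run_then_free` are hypothetical stand-ins that only mimic the shape of the `cass_*_free` functions; they are not the driver's API.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in resource; NOT the driver's CassSession.
struct StubSession {
    std::atomic<int> in_flight{0};  // threads currently using the session
    std::atomic<int> completed{0};  // pretend queries that have finished
};

// Stand-in for a cass_*_free-style function: call it exactly once, from one
// thread, after every user has finished.
// Returns true if nothing was still using the session at free time.
bool stub_session_free(StubSession* session) {
    const bool idle = (session->in_flight.load() == 0);
    delete session;
    return idle;
}

// Runs `workers` threads against one session, then frees it safely.
bool run_then_free(int workers) {
    StubSession* session = new StubSession();
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([session] {
            session->in_flight.fetch_add(1);
            session->completed.fetch_add(1);  // pretend to run a query
            session->in_flight.fetch_sub(1);
        });
    }
    for (auto& t : threads) t.join();   // wait for every user first
    return stub_session_free(session);  // then free once, from one thread
}
```

Because the free happens only after all the `join()` calls return, no worker can still be touching the session, and no two threads can race to free it.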