I have to look into solutions for providing a MySQL database that can handle data volumes in the terabyte range and be highly available (five nines). Each database row is likely to have a timestamp and up to 30 float values. The expected workload is up to 2500 inserts/sec. Queries are likely to be less frequent but could be large (maybe involving 100Gb of data) though probably only involving single tables.
I have been looking at MySQL Cluster given that is their HA offering. Due to the volume of data I would need to make use of disk based storage. Realistically I think only the timestamps could be held in memory and all other data would need to be stored on disk.
Does anyone have experience of using MySQL Cluster on a database of this scale? Is it even viable? How does disk based storage affect performance?
I am also open to other suggestions for how to achieve the desired availability for this volume of data. For example, would it be better to use a third party libary like Sequoia to handle the clustering of standard MySQL instances? Or a more straight forward solution based on MySQL replication?
The only condition is that it must be a MySQL based solution. I don't think that MySQL is the best way to go for the data we are dealing with but it is a hard requirement.
Speed wise, it can be handled. Size wise, the question is not the size of your data, but rather the size of your index as the indices must fit fully within memory.
I'd be happy to offer a better answer, but high-end database work is very task-dependent. I'd need to know a lot more about what's going on with the data to be of further help.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With