Is it reasonable to use Kubernetes for a clustered database such as MySQL in production environment?
There are example configurations such as mysql galera example. However, most examples do not make use of persistent volumes. As far as I've understood persistent volumes must reside on some shared file system as defined here Kubernetes types of persistent volumes. A shared file system will not guarantee that the database files of the pod will be local to the machine hosting the pod. It will be accessed over network which is rather slow. Moreover, there are issues with MySQL and NFS, for example.
This might be acceptable for a test environment. However, what should I do in a production environment? Is it better to run the database cluster outside Kubernetes and run only application servers with Kubernetes?
Better resource utilization But running Kubernetes provides an infrastructure-as-code approach to these challenges. This makes it easy to handle multiple microservices deployments at scale, while optimizing resource utilization on the available nodes. This is really one of the best arguments for Kubernetes.
The features of CockroachDB, such as symmetrical instances, automatic fail-over, and distributed architecture, make it the perfect choice for Kubernetes-related workloads. Furthermore, their open-source Kubernetes operator can automate deployment and manage the overall Kubernetes cluster.
If you're working on a small project, and are deploying to a single machine, it's completely okay to run your database in a Docker container. Be sure to mount a volume to make the data persistent, and have backup processes in place. Try to restore them every once in a while to make sure your backups are any good.
Kubernetes is complex and it becomes more complex still when you prepare your application for production. Yet with the right tools and processes, you can harness its power and build on it, delivering a secure, reliable production application for which you can deliver new features quickly and without disruption.
The Kubernetes project introduced PetSets, a new pod management abstraction, intended to run stateful applications. It is an alpha feature at present (as of version 1.4) and moving rapidly. A list of the various issues as we move to beta are listed here. Quoting from the section on when to use petsets:
A PetSet ensures that a specified number of "pets" with unique identities are running at any given time. The identity of a Pet is comprised of:
In addition to the above, it can be coupled with several other features which help one deploy clustered stateful applications and manage them. Coupled with dynamic volume provisioning for example, it can be used to provision storage automatically.
There are several YAML configuration files available (such as the ones you referenced) using ReplicaSets and Deployments for MySQL and other databases which may be run in production and are probably being run that way as well. However, PetSets are expected to make it a lot easier to run these types of workloads, while supporting upgrades, maintenance, scaling and so on.
You can find some examples of distributed databases with petsets here.
The advantage of provisioning persistent volumes which are networked and non-local (such as GlusterFS) is realized at scale. However, for relatively small clusters, there is a proposal to allow for local storage persistent volumes in the future.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With