Why StatefulSets? Can't a stateless Pod use persistent volumes?

I am trying to understand Stateful Sets. How does their use differ from the use of "stateless" Pods with Persistent Volumes? That is, assuming that a "normal" Pod may lay claim to persistent storage, what obvious thing am I missing that requires this new construct (with ordered start/stop and so on)?

893

asked Jan 19 '17 02:01

Laird Nelson

2 Answers

Yes, a regular pod can use a persistent volume. However, sometimes you have multiple pods that logically form a "group". Examples include database replicas, ZooKeeper hosts, and Kafka nodes. In all of these cases there is a set of servers that work together and talk to each other. What's special about them is that each member of the group has an identity. For example, in a database cluster one member is the master and two are followers, and each follower communicates with the master to let it know what it has and has not synced. So the followers know that "db-x-0" is the master, and the master knows that "db-x-2" is a follower that has all the data up to a certain point but still needs the data beyond that.

In such situations you need a few things you can't easily get from a regular pod:

1. A predictable name: you want to start your pods by telling them where to find each other so they can form a cluster, elect a leader, etc., but you need to know their names in advance to do that. Normal pod names are random, so you can't know them in advance.
2. A stable address/DNS name: you want whatever names were available in step (1) to stay the same. If a normal pod restarts on another host (you redeploy, the host where it was running dies, etc.), it'll get a new name and a new IP address.
3. A persistent link between an individual in the group and its persistent volume: if the host where your database master was running dies, the master gets moved to a new host but should connect to the same persistent volume, as there's one and only one volume that contains the right data for that "individual". So, for example, if you redeploy your group of 3 database hosts, you want the same individual (by pod name and DNS name) to get the same persistent volume, so the master is still the master and still has the same data, replica1 gets its data, etc.
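
The three needs above can be made concrete with a small sketch of the naming scheme a StatefulSet follows. Pods are named `<set-name>-<ordinal>`, each pod gets a DNS entry under the set's governing headless Service, and each claim created from a volume claim template is named `<template-name>-<pod-name>`. The helper name `stateful_identities`, the Service name `db-x`, the template name `data`, and the default cluster domain `cluster.local` are assumptions made for illustration.

```python
def stateful_identities(set_name, service, replicas, namespace="default",
                        claim_template="data", cluster_domain="cluster.local"):
    """Return the pod name, DNS name, and PVC name for each ordinal.

    Illustrative helper only: it mirrors the naming convention, it does not
    talk to a cluster.
    """
    identities = []
    for ordinal in range(replicas):
        pod = f"{set_name}-{ordinal}"
        identities.append({
            "pod": pod,
            # Resolvable only when `service` is a headless Service for the set.
            "dns": f"{pod}.{service}.{namespace}.svc.{cluster_domain}",
            # The claim keeps this name across rescheduling of the pod.
            "pvc": f"{claim_template}-{pod}",
        })
    return identities


for identity in stateful_identities("db-x", "db-x", 3):
    print(identity["pod"], identity["dns"], identity["pvc"])
```

Because these names depend only on the set name and the ordinal, a follower can be configured with the master's address (`db-x-0.db-x.default.svc.cluster.local` here) before any pod exists.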

StatefulSets solve these issues because they provide (quoting from https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/):

1. Stable, unique network identifiers.
2. Stable, persistent storage.
3. Ordered, graceful deployment and scaling.
4. Ordered, graceful deletion and termination.

I didn't really talk about (3) and (4), but they can also help with clusters: you can tell the first pod deployed to become the master and have the next one find the first and treat it as the master, and so on.
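
As a rough sketch of how the quoted guarantees map onto manifest fields, the snippet below builds a StatefulSet manifest as a Python dictionary and prints it as JSON. The function name `statefulset_manifest`, the image `example.com/db:1.0`, the mount path, and the storage size are placeholders, not values from the question or the documentation.

```python
import json


def statefulset_manifest(name, image, replicas, storage="1Gi"):
    """Build a minimal StatefulSet manifest (illustrative values only)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that provides the stable per-pod DNS names.
            "serviceName": name,
            "replicas": replicas,
            # The default policy: pods start in order 0, 1, 2, ... and are
            # terminated in reverse order.
            "podManagementPolicy": "OrderedReady",
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "db",
                        "image": image,  # placeholder image
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/data"},
                        ],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


manifest = statefulset_manifest("db-x", "example.com/db:1.0", replicas=3)
print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -`. A headless Service (one with `clusterIP: None`) named `db-x` would have to be created separately for the per-pod DNS names to resolve.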

As some have noted, you can indeed get some of the same benefits by using regular pods and services, but it's much more work. For example, if you wanted 3 database instances you could manually create 3 deployments and 3 services. Note that you must manually create 3 deployments because you can't have a service point to a single pod within a deployment. Then, to scale up, you'd manually create another deployment and another service. This does work and was a somewhat common practice before PetSet (since renamed StatefulSet) came along. Note that it is missing some of the benefits listed above (persistent volume mapping and fixed start order, for example).

166

answered Sep 21 '22 05:09

Oliver Dain

1: Why StatefulSets?

Stateless app: Frontend components usually have completely different scaling requirements than backends, so we tend to scale them individually. Backends such as databases are also usually much harder to scale than (stateless) frontend web servers. The term “stateless” means that no past data or state is stored or needs to persist when a new container is created.

Stateful app: Stateful applications typically involve a database, such as Cassandra, MongoDB, or MySQL, and process reads and/or writes against it.

2: Can't a stateless Pod use persistent volumes?

Basically, there are a few ways to do it. However, each has its own disadvantages.

1: USING ONE REPLICASET PER POD INSTANCE

You could create multiple ReplicaSets, one for each pod, with each ReplicaSet’s desired replica count set to one and each ReplicaSet’s pod template referencing a dedicated PersistentVolumeClaim.

Although this takes care of the automatic rescheduling in case of node failures or accidental pod deletions, it’s much more cumbersome compared to having a single ReplicaSet.
For example, think about how you’d scale the pods in that case. You couldn’t change the desired replica count; you’d have to create additional ReplicaSets instead. Using multiple ReplicaSets is therefore not the best solution.
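
To illustrate why this approach gets cumbersome, here is a minimal sketch that generates one ReplicaSet manifest per instance, each pointing at its own claim. The helper name `replicaset_per_instance`, the `<name>-data` claim naming, the image, and the mount path are assumptions for the example, and the claims themselves would still need to be created by hand.

```python
def replicaset_per_instance(app, image, instances):
    """Build one single-replica ReplicaSet per instance (illustrative only)."""
    manifests = []
    for index in range(instances):
        name = f"{app}-{index}"
        # Each ReplicaSet needs its own label so the selectors do not overlap.
        labels = {"app": app, "instance": str(index)}
        manifests.append({
            "apiVersion": "apps/v1",
            "kind": "ReplicaSet",
            "metadata": {"name": name},
            "spec": {
                "replicas": 1,
                "selector": {"matchLabels": labels},
                "template": {
                    "metadata": {"labels": labels},
                    "spec": {
                        "containers": [{
                            "name": "db",
                            "image": image,  # placeholder image
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/var/lib/data"},
                            ],
                        }],
                        "volumes": [{
                            "name": "data",
                            # Dedicated, pre-created claim for this instance.
                            "persistentVolumeClaim": {"claimName": f"{name}-data"},
                        }],
                    },
                },
            },
        })
    return manifests


manifests = replicaset_per_instance("db-x", "example.com/db:1.0", 3)
print([m["metadata"]["name"] for m in manifests])
```

Scaling from 3 to 4 instances here means generating and applying a whole new ReplicaSet (and a new claim) rather than changing one `replicas` field, and the pods created by each ReplicaSet still get randomly suffixed names.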

2: USING MULTIPLE DIRECTORIES IN THE SAME VOLUME

A trick you can use is to have all pods use the same PersistentVolume, but give each pod a separate directory inside that volume. Because you can’t configure pod replicas differently from a single pod template, you can’t tell each instance what directory it should use, but you can make each instance automatically select (and possibly also create) a data directory that isn’t being used by any other instance at that time.

This solution does require coordination between the instances, and isn’t easy to do correctly. It also makes the shared storage volume the bottleneck.
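
One possible way an instance could select a free directory is sketched below. It relies on `os.mkdir` failing if the directory already exists, and assumes the shared volume's filesystem makes that operation atomic. The function names and the `instance-<n>` / `instance-<n>.lock` layout are invented for this example.

```python
import os
import tempfile


def claim_data_dir(volume_root, max_instances):
    """Claim the first free slot under a shared volume.

    Returns (slot, data_dir). A separate lock directory marks a slot as
    "in use right now", while the data directory itself persists.
    """
    for slot in range(max_instances):
        lock_dir = os.path.join(volume_root, f"instance-{slot}.lock")
        try:
            os.mkdir(lock_dir)  # fails if another instance holds the slot
        except FileExistsError:
            continue
        data_dir = os.path.join(volume_root, f"instance-{slot}")
        os.makedirs(data_dir, exist_ok=True)
        return slot, data_dir
    raise RuntimeError("no free data directory")


def release_data_dir(volume_root, slot):
    """Release the slot so another instance can claim it."""
    os.rmdir(os.path.join(volume_root, f"instance-{slot}.lock"))


# Local demonstration using a temporary directory in place of the volume.
with tempfile.TemporaryDirectory() as root:
    first_slot, _ = claim_data_dir(root, 3)
    second_slot, _ = claim_data_dir(root, 3)
    print(first_slot, second_slot)  # prints "0 1"
    release_data_dir(root, first_slot)
```

Even this small sketch shows the difficulty: if an instance crashes without calling `release_data_dir`, its lock is left behind and the slot stays blocked until something cleans it up.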

That's why StatefulSets are the better choice for this kind of workload.

answered Sep 21 '22 05:09

Gupta


Tags:

kubernetes

kubernetes-statefulset