Everything we have said so far is about stateless, stateless is really where our applications do not care which network it is using and does not need any permanent storage. Whereas stateful apps and databases for example for such an application to function correctly, you’ll need to ensure that pods can reach each other through a unique identity that does not change (hostnames, IPs...etc.). Examples of stateful applications include MySQL clusters, Redis, Kafka, MongoDB and others. Basically, through any application that stores data.
StatefulSets represent a set of Pods with unique, persistent identities and stable hostnames that Kubernetes maintains regardless of where they are scheduled. The state information and other resilient data for any given StatefulSet Pod are maintained in persistent disk storage associated with the StatefulSet.
Something you will see in our demonstration shortly is that each pod has its own identity. With a stateless Application, you will see random names. For example `app-7469bbb6d7-9mhxd` whereas a Stateful Application would be more aligned to `mongo-0` and then when scaled it will create a new pod called `mongo-1`.
These pods are created from the same specification, but they are not interchangeable. Each StatefulSet pod has a persistent identifier across any rescheduling. This is necessary because when we require stateful workloads such as a database where we require writing and reading to a database, we cannot have two pods writing at the same time with no awareness as this will give us data inconsistency. We need to ensure that only one of our pods is writing to the database at any given time however we can have multiple pods reading that data.
Each pod in a StatefulSet would have access to its persistent volume and replica copy of the database to read from, this is continuously updated from the master. It's also interesting to note that each pod will also store its pod state in this persistent volume, if then `mongo-0` dies then when a new one is provisioned it will take over the pod state stored in storage.
We mentioned above when we have a stateful application, we have to store the state somewhere and this is where the need for a volume comes in, out of the box Kubernetes does not provide persistence out of the box.
We require a storage layer that does not depend on the pod lifecycle. This storage should be available and accessible from all of our Kubernetes nodes. The storage should also be outside of the Kubernetes cluster to be able to survive even if the Kubernetes cluster crashes.
In the session yesterday we walked through creating a stateless application, here we want to do the same but we want to use our minikube cluster to deploy a stateful workload.
A recap on the minikube command we are using to have the capability and addons to use persistence is `minikube start --addons volumesnapshots,csi-hostpath-driver --apiserver-port=6443 --container-runtime=containerd -p mc-demo --kubernetes-version=1.21.2`
There is one more step though that we should run before we start deploying our application and that is to make sure that our storageclass (CSI-hostpath-sc) is our default one. We can firstly check this by running the `kubectl get storageclass` command but out of the box, the minikube cluster will be showing the standard storageclass as default so we have to change that with the following commands.
We will then deploy our YAML file. `kubectl create -f pacman-stateful-demo.yaml` you can see from this command we are creating several objects within our Kubernetes cluster.
You can then see from the next image and command `kubectl get all -n pacman` that we have several things happening inside of our namespace. We have our pods running our NodeJS web front end, we have mongo running our backend database. There are services for both Pacman and mongo to access those pods. We have a deployment for Pacman and a statefulset for mongo.
We also have our persistent volume and persistent volume claim by running `kubectl get pv` will give us our non-namespaced persistent volumes and running `kubectl get pvc -n pacman` will give us our namespaced persistent volume claims.
Because we are using Minikube as mentioned in the stateless application we have a few hurdles to get over when it comes to accessing our application, however, we had access to ingress or a load balancer within our cluster the service is set up to automatically get an IP from that to gain access externally. (you can see this above in the image of all components in the Pacman namespace).
For this demo, we are going to use the port forward method to access our application. By opening a new terminal and running the following `kubectl port-forward svc/pacman 9090:80 -n pacman` command, opening a browser we will now have access to our application. If you are running this in AWS or specific locations then this will also report on the cloud and zone as well as the host which equals your pod within Kubernetes, again you can look back and see this pod name in our screenshots above.
Ok, great we have a high score but what happens if we go and delete our `mongo-0` pod? by running `kubectl delete pod mongo-0 -n pacman` I can delete that and if you are still in the app you will see that high score not available at least for a few seconds.
Now if I go back to my game I can create a new game and see my high scores. The only way you can truly believe me on this though is if you give it a try and share on social media your high scores!
With the deployment, we can scale this up using the commands that we covered in the previous session but in particular here, especially if you want to host a huge Pacman party then you can scale this up using `kubectl scale deployment pacman --replicas=10 -n pacman`
So far with our examples, we have used port-forward or we have used specific commands within minikube to gain access to our applications but this in production is not going to work. We are going to want a better way of accessing our applications at scale with multiple users.
If you are using a cloud provider, a managed Kubernetes offering they most likely will have their ingress option for your cluster or they provide you with their load balancer option. You don't have to implement this yourself, one of the benefits of managed Kubernetes.
I am then told because we are using minikube running on WSL2 in Windows we have to create the minikube tunnel using `minikube tunnel --profile=mc-demo`
If anyone has or can get this working on Windows and WSL I would appreciate the feedback. I will raise an issue on the repository for this and come back to it once I have time and a fix.
UPDATE: I feel like this blog helps identify maybe the cause of this not working with WSL [Configuring Ingress to run Minikube on WSL2 using Docker runtime](https://hellokube.dev/posts/configure-minikube-ingress-on-wsl2/)
This wraps up our Kubernetes section, there is so much additional content we could cover on Kubernetes and 7 days gives us foundational knowledge but people are running through [100DaysOfKubernetes](https://100daysofkubernetes.io/overview.html) where you can get really into the weeds.