Has anyone managed to run an H2O cluster in Kubernetes?
I tried two options, both using a flatfile: 1) a StatefulSet, but since the IP assigned to a pod can change, the cluster is unreliable; 2) a set of Service/Deployment pairs, with the DNS name of each Service listed in the flatfile, but the cluster doesn't start up correctly.
Neither of these works. Is there any way to make it work?
Kubernetes is an open-source container management platform that unifies a cluster of machines into a single pool of compute resources. With Kubernetes, you organize your applications into groups of containers, which it runs through a container runtime (historically the Docker engine), and it keeps your application running in the state you requested.
By the way, if you're wondering where the name “Kubernetes” came from, it is a Greek word, meaning helmsman or pilot. The abbreviation K8s is derived by replacing the eight letters of “ubernete” with the digit 8.
A Kubernetes pod is a collection of one or more Linux® containers, and is the smallest unit of a Kubernetes application. Any given pod can be composed of multiple, tightly coupled containers (an advanced use case) or just a single container (a more common use case).
Here's a quick list to understand this: Containers are packages of applications and execution environments. Pods are collections of closely-related or tightly coupled containers. Nodes are computing resources that house pods to execute workloads.
If multicast packets can be transmitted between the pods, then you could rely on that for cluster formation. Just pass the same -name value, unique to this cluster, to every node. This is easy if it works, and it needs no code changes.
UPDATE (2018/04/21) -- one of my colleagues says:
I used Weave as the network layer. It provides a connection between all the containers for that Kubernetes pod group, so you don't need to use the flatfile in H2O: H2O will multicast on startup, and Weave will take the multicast and send it to all instances of the pod.
In K8s, run this: kubectl apply --filename https://git.io/weave-kube-1.6
If multicast is not an option, there isn't an out-of-the-box solution today for Kubernetes that I'm aware of.
You will need an orchestrator to distribute the flatfile information.
There are at least three examples of code that does this for other environments in the H2O GitHub repos.
https://github.com/h2oai/h2o-3/tree/master/ec2
https://github.com/h2oai/h2o-3/blob/master/h2o-hadoop/h2o-mapreduce-generic/src/main/java/water/hadoop/h2omapper.java
In particular, look at how this class gets overridden:
https://github.com/h2oai/h2o-3/blob/master/h2o-core/src/main/java/water/init/AbstractEmbeddedH2OConfig.java
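As a rough illustration of what such an orchestrator step might look like in Kubernetes, here is a minimal Python sketch that builds a flatfile (one ip:port entry per line) from the addresses behind a DNS name. It assumes the H2O pods sit behind a headless Service, so that a DNS lookup returns one A record per pod; the service name h2o-headless and the output path are hypothetical, and 54321 is H2O's default port. This has not been tested against H2O and is not taken from the H2O repos. It would also need to run only after all pods are up, since the lookup returns just the pods that exist at that moment.

```python
import socket

H2O_PORT = 54321  # H2O's default port; change it if you start H2O with -port


def resolve_peer_ips(dns_name):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name.

    Assumption: for a headless Kubernetes Service, the lookup yields
    one address per pod that is currently registered.
    """
    infos = socket.getaddrinfo(
        dns_name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})


def render_flatfile(ips, port=H2O_PORT):
    """Render one 'ip:port' line per node, as read by H2O's -flatfile option."""
    return "".join(f"{ip}:{port}\n" for ip in ips)


def write_flatfile(path, dns_name, port=H2O_PORT):
    """Resolve the peers and write the flatfile; return what was written."""
    content = render_flatfile(resolve_peer_ips(dns_name), port)
    with open(path, "w") as fh:
        fh.write(content)
    return content
```

Each pod could call something like write_flatfile("/tmp/flatfile.txt", "h2o-headless") in its entrypoint and then start H2O with -flatfile /tmp/flatfile.txt. Resolving to IPs at startup sidesteps putting DNS names in the flatfile, but it does not handle pods whose IPs change later, which is the case the AbstractEmbeddedH2OConfig approach is meant to cover.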