 

Presto with Kubernetes

We are trying to run Presto on Kubernetes. We have a Kubernetes cluster running in the cloud as a managed service. I searched online but could not find conclusive guidance on best practices for deploying Presto on Kubernetes. The official Presto GitHub repository exists, but it does not help with this. I am looking for answers to the following two questions:

  1. What is the best approach to configuring Presto on Kubernetes, including sizing choices such as the ideal number of worker replicas?
  2. How can we performance test this deployment?
Anshul Verma asked Sep 11 '18 08:09



2 Answers

You could install Presto with the official Helm chart at https://github.com/helm/charts/tree/master/stable/presto, which provides an option to set the number of workers. With the official chart you should be able to ask questions in the Kubernetes charts Slack channel (through http://slack.k8s.io) and raise issues on GitHub if you hit any problems. Alternatively, there are non-Helm examples such as https://github.com/dharmeshkakadia/presto-kubernetes

The question of how many workers to run isn't specific to Kubernetes. It depends on how much load, and what kind of load, the deployment needs to handle, and also on the hardware your Kubernetes cluster runs on. If you're not sure, you can deploy with the defaults and adjust as needed, as suggested by https://prestodb.io/presto-admin/docs/current/installation/presto-configuration.html. You'll find some of the settings, such as memory per node, in the Deployment sections of the Kubernetes YAML descriptors, or in values.yaml in the case of the Helm chart.
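As a rough starting point for "deploy with the defaults and adjust", it can help to know the ceiling your hardware imposes. The sketch below is only illustrative arithmetic, not a Presto or Kubernetes API: the function name, the 10% headroom, and the example memory figures are all assumptions, and it considers memory only (not CPU, query concurrency, or data volume).

```python
import math


def max_workers_by_memory(node_count, node_allocatable_gib,
                          worker_request_gib, coordinator_request_gib,
                          headroom=0.10):
    """Upper bound on Presto worker pods that fit in the cluster, by memory only.

    All parameters are hypothetical inputs you would read from your own
    cluster and from the memory request in your Deployment or values.yaml.
    """
    if worker_request_gib <= 0:
        raise ValueError("worker_request_gib must be positive")
    if node_count < 1:
        return 0

    # Leave some memory free on every node for system pods (assumed fraction).
    usable = node_allocatable_gib * (1.0 - headroom)

    # Workers that fit on a node running nothing else.
    per_node = math.floor(usable / worker_request_gib)

    # One node also hosts the coordinator, so it has less room for workers.
    left_on_coordinator_node = max(usable - coordinator_request_gib, 0.0)
    per_coordinator_node = math.floor(left_on_coordinator_node / worker_request_gib)

    return per_node * (node_count - 1) + per_coordinator_node


if __name__ == "__main__":
    # Example figures are made up: 4 nodes with 32 GiB allocatable each,
    # 12 GiB per worker pod, 8 GiB for the coordinator pod.
    print(max_workers_by_memory(4, 32, 12, 8))
```

This gives a ceiling, not a recommendation. The right replica count still has to come from measuring your own workload.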

To performance test your deployment you will need test data, and you can then run queries against the cluster. This is the same process you would follow outside of Kubernetes. There are tools to help, such as Benchto (see https://www.lewuathe.com/use-benchto-for-evaluation-of-presto.html) or https://github.com/prestodb/tempto. You may also want to look at https://kognitio.com/blog/presto-performance-powerful-or-problematic/
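If you want something lighter than those tools for a first look, the "run queries and measure" loop can be sketched in a few lines. This is a minimal sketch under stated assumptions, not how Benchto or Tempto work: `run_query` is a hypothetical callable that you would implement with whatever Presto client you use, `fake_run_query` is a stand-in so the script runs without a cluster, and the second example query assumes the TPC-H connector is configured as a catalog named `tpch`.

```python
import statistics
import time


def benchmark(run_query, queries, repeats=5, warmup=1):
    """Time each named query and return simple latency statistics.

    run_query: callable taking a SQL string and blocking until the
               query has finished (hypothetical; supply your own client).
    queries:   dict mapping a label to a SQL string.
    """
    results = {}
    for name, sql in queries.items():
        # Warm-up runs are executed but not timed.
        for _ in range(warmup):
            run_query(sql)

        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(sql)
            timings.append(time.perf_counter() - start)

        results[name] = {
            "runs": len(timings),
            "min_s": min(timings),
            "median_s": statistics.median(timings),
            "max_s": max(timings),
        }
    return results


def fake_run_query(sql):
    """Stand-in for a real Presto client call, so the sketch runs offline."""
    time.sleep(0.001)


if __name__ == "__main__":
    example_queries = {
        "trivial": "SELECT 1",
        # Assumes a catalog named tpch backed by the TPC-H connector.
        "count_orders": "SELECT count(*) FROM tpch.tiny.orders",
    }
    for name, stats in benchmark(fake_run_query, example_queries).items():
        print(name, stats)
```

Running the same set of queries before and after changing the worker count or memory settings gives a like-for-like comparison. A single client issuing queries sequentially says nothing about behaviour under concurrent load.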

Ryan Dawson answered Oct 28 '22 13:10



There are a couple of examples available that show how this can be done, for example dharmeshkakadia/presto-kubernetes, but I suspect you would want to use a StatefulSet here instead. I'm not sure about performance tests, because much will depend on the kind of persistent volume you choose, or rather on what backs it: for example NFS or Ceph. Or are you perhaps in a cloud environment with native storage?

Michael Hausenblas answered Oct 28 '22 12:10
