Difference in use cases for AWS SageMaker vs Databricks?

I was looking at Databricks because it integrates with AWS services like Kinesis, but it looks to me like SageMaker is a direct competitor to Databricks. We use AWS heavily. Is there any reason to add Databricks to the stack, or does SageMaker fill the same role?

L Xandor asked Mar 13 '19 00:03


People also ask

Is SageMaker like Databricks?

Databricks focuses on big data analytics, letting you run your data processing code on compute clusters. SageMaker focuses on experiment tracking and model deployment. Both tools let data scientists write code in a familiar notebook environment and run it on scalable infrastructure.

How is Databricks different from AWS?

Although AWS EMR integrates with AWS services, users have to spend time configuring tools. Databricks, by comparison, lets users with less technical background perform data science and analytics at scale.

What are the limitations of SageMaker?

SageMaker does not allow you to schedule training jobs. SageMaker does not provide a mechanism for easily tracking metrics logged during training. We often fit feature extraction and model pipelines. We can inject the model artifacts into AWS-provided containers, but we cannot inject the feature extractors.

What is the main reason to use AWS SageMaker?

Amazon SageMaker is a fully managed machine learning service. With SageMaker, data scientists and developers can quickly and easily build and train machine learning models, and then directly deploy them into a production-ready hosted environment.


2 Answers

SageMaker is a great tool for deployment. It simplifies much of the container configuration: you only need to write 2-3 lines to deploy a model as an endpoint and use it. SageMaker also provides a development platform (Jupyter Notebook) that supports Python and Scala (via the sparkmagic kernel), and I managed to install an external Scala kernel in a Jupyter notebook. Overall, SageMaker provides end-to-end ML services. Databricks has an unbeatable notebook environment for Spark development.

Conclusion

  1. Databricks is the better platform for big data development (Scala, PySpark), thanks to its unbeatable notebook environment.

  2. SageMaker is better for deployment. If you are not working with big data, SageMaker is a perfect choice (Jupyter notebook + scikit-learn + mature containers + very easy deployment).

  3. SageMaker provides real-time inference that is very easy to build and deploy, which is impressive. You can check the official SageMaker GitHub examples: https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/scikit_learn_inference_pipeline
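To give a feel for the "2-3 lines to deploy" claim, here is a minimal sketch assuming the v2-style SageMaker Python SDK (`sagemaker.sklearn.model.SKLearnModel` and its `deploy` method). The S3 path, role ARN, script name, framework version, and instance type are all placeholders, not values from this answer, and the `model_cls` parameter is a hypothetical hook added only so the function can be exercised without AWS access.

```python
def deploy_sklearn_endpoint(model_data, role, entry_point,
                            framework_version="1.2-1",    # assumed version, adjust to your SDK
                            instance_type="ml.m5.large",  # assumed instance type
                            model_cls=None):
    """Deploy a trained scikit-learn model artifact as a real-time endpoint."""
    if model_cls is None:
        # Imported lazily so the sketch can be read and tested without the SDK installed.
        from sagemaker.sklearn.model import SKLearnModel
        model_cls = SKLearnModel

    # Wrap the model.tar.gz artifact and the inference script in a SageMaker model.
    model = model_cls(
        model_data=model_data,
        role=role,
        entry_point=entry_point,
        framework_version=framework_version,
    )
    # Create the hosted endpoint and return a predictor object for it.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Example call (placeholders, requires AWS credentials and incurs cost):
# predictor = deploy_sklearn_endpoint(
#     model_data="s3://my-bucket/model.tar.gz",
#     role="arn:aws:iam::123456789012:role/MySageMakerRole",
#     entry_point="inference.py",
# )
```

The two calls that matter are the model constructor and `deploy`; everything else is wrapping. Remember to delete the endpoint when you are done, since it is billed while it runs.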

seninus answered Sep 22 '22 07:09


Having worked in both environments within the last year, I specifically remember:

  • Databricks: easy access to stored databases/tables to query, with Scala/Spark available within the notebooks. I remember how nice it was to preview the schemas, query quickly, and be off to the races for research. I also remember how quick it was to set up a scheduled job on a notebook (for example, re-run every month) and switch to job instance types (much cheaper) with a few button clicks. These features might exist somewhere in AWS, but I remember them being great in Databricks.

  • AWS SageMaker + Lambda + API Gateway: Just today I worked through a deployment with AWS SageMaker + Lambda + API Gateway, and after getting used to some of the syntax and specifics of Lambda + API Gateway it was pretty straightforward. Doing another AWS deployment wouldn't take more than 20 minutes (barring any unusual requirements). Other things like Model Monitor and CloudWatch are nice as well. I did notice Jupyter notebook kernels for many languages, such as Python (which I used), R, and Scala, along with pre-installed tooling such as conda and the SageMaker ML packages and methods.
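The SageMaker + Lambda + API Gateway setup described above can be sketched roughly as follows. This is an illustration, not the answerer's actual code: it assumes an API Gateway proxy integration, an endpoint that accepts `text/csv`, and a request body of the form `{"features": [...]}`. The `ENDPOINT_NAME` variable, the payload shape, and the optional `runtime` parameter (added so the handler can be tested with a stub) are all hypothetical.

```python
import json
import os

# Hypothetical endpoint name, normally set as a Lambda environment variable.
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "my-endpoint")


def lambda_handler(event, context, runtime=None):
    """Forward a JSON request from API Gateway to a SageMaker endpoint."""
    if runtime is None:
        # boto3 ships with the Lambda Python runtime; imported lazily here.
        import boto3
        runtime = boto3.client("sagemaker-runtime")

    # With proxy integration, the HTTP body arrives as a string in event["body"].
    body = json.loads(event.get("body") or "{}")
    features = body.get("features")
    if not features:
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing 'features'"})}

    # Serialize the feature vector as a single CSV row (assumed input format).
    payload = ",".join(str(x) for x in features)
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",
        Body=payload,
    )
    prediction = response["Body"].read().decode("utf-8")

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": prediction}),
    }
```

The Lambda's execution role needs permission for `sagemaker:InvokeEndpoint` on the endpoint; API Gateway then only has to route a POST method to this function.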

kevin_theinfinityfund answered Sep 22 '22 07:09