I was looking at Databricks because it integrates with AWS services like Kinesis, but it looks to me like SageMaker is a direct competitor to Databricks. We are heavily using AWS; is there any reason to add Databricks to the stack, or does SageMaker fill the same role?
Databricks focuses on big data analytics, letting you run your data-processing code on compute clusters. SageMaker focuses on experiment tracking and model deployment. Both tools let data scientists write code in a familiar notebook environment and run it on scalable infrastructure.
Although AWS EMR integrates with other AWS services, users have to spend time configuring the tooling themselves. Databricks, by contrast, lets less technical users perform data science and analytics at scale without much prior infrastructure knowledge.
SageMaker does not provide a built-in way to schedule training jobs, nor an easy mechanism for tracking metrics logged during training. We often fit combined feature-extraction and model pipelines: we can inject the model artifacts into the AWS-provided containers, but we cannot inject the fitted feature extractors (the sketch below illustrates why that matters).
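To make that last point concrete, here is a minimal sketch in scikit-learn of the kind of combined pipeline meant: the feature extractor is fitted alongside the model, so serving only the bare model artifact loses the extraction step. The tiny training corpus is, of course, hypothetical.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# The feature extractor and the model are fitted together as one pipeline.
pipeline = Pipeline([
    ("features", TfidfVectorizer()),   # fitted feature extractor
    ("model", LogisticRegression()),   # the model artifact itself
])
pipeline.fit(["great product", "terrible product"], [1, 0])

# Serializing only pipeline.named_steps["model"] for the serving container
# would drop the fitted TfidfVectorizer; the whole pipeline must ship together.
```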
Amazon SageMaker is a fully managed machine learning service. With SageMaker, data scientists and developers can quickly and easily build and train machine learning models, and then directly deploy them into a production-ready hosted environment.
SageMaker is a great tool for deployment: it simplifies much of the container configuration, and you only need to write two or three lines to deploy a model as an endpoint and use it (see the sketch below). SageMaker also provides a dev platform (Jupyter Notebook) that supports Python and Scala (via the sparkmagic kernel), and I managed to install an external Scala kernel in the Jupyter notebook. Overall, SageMaker provides end-to-end ML services. Databricks has an unbeatable notebook environment for Spark development.
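For reference, a hedged sketch of that short deployment path using the SageMaker Python SDK; the S3 artifact path, IAM role, and inference.py entry-point script are hypothetical placeholders:

```python
from sagemaker.sklearn import SKLearnModel

# Hypothetical artifact location, role ARN, and entry-point script.
model = SKLearnModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    entry_point="inference.py",          # defines model_fn / predict_fn hooks
    framework_version="1.2-1",
)

# One call stands up a managed HTTPS endpoint; one more gets a prediction.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
print(predictor.predict([[0.1, 0.2, 0.3, 0.4]]))
```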
Conclusion
Databricks is the better platform for big data (Scala, PySpark) development, with an unbeatable notebook environment.
SageMaker is better for deployment, and if you are not working with big data it is a perfect choice: Jupyter notebooks + scikit-learn + mature containers + super easy deployment.
SageMaker provides "real time inference", very easy to build and deploy, very impressive. you can check the official SageMaker Github. https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/scikit_learn_inference_pipeline
Having worked in both environments within the last year, I specifically remember:
Databricks having easy access to stored databases/tables to query with Scala/Spark from within the notebooks. It was nice to preview the schemas, query quickly, and be off to the races for research (a sketch of that workflow follows after these notes). I also remember how quick it was to set up a timed job on a notebook (re-run every month) and rescale to cheaper job instance types with a few button clicks. These capabilities might exist somewhere in AWS, but I remember them being great in Databricks.
AWS SageMaker + Lambda + API Gateway: just today I worked through a deployment of SageMaker + Lambda + API Gateway, and after getting used to some syntax and specifics of Lambda and API Gateway it was pretty straightforward (a sketch of the Lambda piece follows below). Another AWS deployment wouldn't take more than 20 minutes, pending unique specifics. Other features like Model Monitoring and CloudWatch are nice as well. The Jupyter notebooks offer kernels for many languages, including Python (which I used), R, and Scala, with packages like conda and the SageMaker ML libraries pre-installed.
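To illustrate the Databricks workflow mentioned above, a minimal sketch of querying a stored table from a notebook cell; the sales table is a hypothetical example, and spark and display are globals that Databricks notebooks provide automatically:

```python
# In a Databricks notebook, `spark` (a SparkSession) and `display` are
# pre-defined; the "sales" table is a hypothetical example.
df = spark.sql("SELECT region, SUM(revenue) AS total FROM sales GROUP BY region")
display(df)  # Databricks' built-in tabular preview of schema and rows
```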
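And for the SageMaker + Lambda + API Gateway setup just described, a hedged sketch of the Lambda piece: API Gateway forwards the request body to this handler, which calls the SageMaker endpoint (the endpoint name, read from an environment variable here, is a placeholder):

```python
import json
import os
import boto3

runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # With a Lambda proxy integration, API Gateway passes the raw request body.
    response = runtime.invoke_endpoint(
        EndpointName=os.environ["ENDPOINT_NAME"],  # hypothetical env var
        ContentType="application/json",
        Body=event["body"],
    )
    result = response["Body"].read().decode()
    return {"statusCode": 200, "body": json.dumps({"prediction": result})}
```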