
Pros and cons of Amazon SageMaker vs. Amazon EMR for deploying TensorFlow-based deep learning models?

I want to build neural network models for NLP and recommendation applications, using TensorFlow as the framework. I plan to train these models and serve predictions on Amazon Web Services. The workload will most likely involve distributed computing.

What are the pros and cons of SageMaker and EMR for TensorFlow applications?

They both have TensorFlow integrated.

CyberPlayerOne asked Sep 21 '18 06:09



2 Answers

From AWS documentation:

Amazon EMR is a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data. By using these frameworks and related open-source projects, such as Apache Hive and Apache Pig, you can process data for analytics purposes and business intelligence workloads. Additionally, you can use Amazon EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB.

(...) Amazon SageMaker is a fully-managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Amazon SageMaker removes all the barriers that typically slow down developers who want to use machine learning.

Conclusion: if you want to deploy AI models, just use Amazon SageMaker.

BSP answered Sep 24 '22 13:09


In general terms, they serve different purposes.

EMR is for when you need to process massive amounts of data and rely heavily on Spark, Hadoop, and MapReduce (EMR = Elastic MapReduce). Essentially, if your data volume is large enough to benefit from the Spark, Hadoop, Hive, HDFS, HBase, and Pig stack, go with EMR.
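To make the EMR side concrete, here is a minimal sketch of the request you would pass to boto3's EMR `run_job_flow` call to get a Spark cluster with TensorFlow installed. The helper name, cluster name, instance type, and release label are my own placeholders rather than anything from the question; check which applications your chosen release label actually ships before relying on it.

```python
def build_emr_cluster_request(name, release_label, core_nodes,
                              instance_type="m5.xlarge"):
    """Build keyword arguments for boto3's EMR `run_job_flow` call.

    Hypothetical helper: it only assembles the request dictionary and
    never contacts AWS.
    """
    return {
        "Name": name,
        # Assumption: the release label you pass includes these applications.
        "ReleaseLabel": release_label,
        "Applications": [{"Name": app} for app in ("Spark", "Hive", "TensorFlow")],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                # "Elastic" in practice: size the core group per job.
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": instance_type, "InstanceCount": core_nodes},
            ],
            # False means the cluster shuts down once it has no steps left.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        # Default role names created by `aws emr create-default-roles`.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# Placeholder values for illustration only.
emr_request = build_emr_cluster_request("tf-preprocessing", "emr-6.15.0", core_nodes=4)

# To actually launch the cluster (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**emr_request)
```

Note that this only gets you the cluster. Distributing TensorFlow training across it is still your job, which is a large part of why EMR fits data processing better than model serving.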

EMR Pros:

  • Generally, low cost compared to EC2 instances
  • Elastic, as the name suggests: you can provision what you need when you need it
  • Hive, Pig, and HBase out of the box

EMR Cons:

  • You need a very specific use case to truly benefit from everything EMR offers. Most users don't take advantage of the full stack

SageMaker is an attempt to make machine learning easier to build and to distribute. Its marketplace provides out-of-the-box algorithms and models for quick use. It's a great service if you conform to the workflow it enforces: creating training jobs and deploying inference endpoints.
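As a rough illustration of the "training job" half of that workflow, the sketch below assembles the request for boto3's SageMaker `create_training_job` call. The helper name, job name, image URI, role ARN, and S3 paths are all hypothetical placeholders; you would substitute your own TensorFlow training image and IAM role.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_count=2,
                               instance_type="ml.p3.2xlarge"):
    """Build keyword arguments for boto3's SageMaker `create_training_job`.

    Hypothetical helper: it only assembles the request dictionary and
    never contacts AWS.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # container holding the TensorFlow code
            "TrainingInputMode": "File",  # copy the data to local disk first
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                # Every instance receives a full copy of the dataset.
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            # More than one instance makes the job distributed.
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# Placeholder values for illustration only; none of these resources exist.
training_request = build_training_job_request(
    job_name="tf-recsys-example",
    image_uri="<your-tensorflow-training-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)

# To actually start the job (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_training_job(**training_request)
```

The deployment half follows the same pattern (`create_model`, `create_endpoint_config`, `create_endpoint`), and the SageMaker Python SDK wraps both halves in higher-level estimator objects if you prefer not to build requests by hand.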

SageMaker Pros:

  • Easy to get up and running with Notebooks
  • Rich marketplace to quickly try existing models
  • Many different example notebooks for popular algorithms
  • Predefined kernels that minimize configuration
  • Easy to deploy models
  • Allows you to distribute inference compute by deploying endpoints

SageMaker Cons:

  • Expensive!
  • Enforces a certain workflow making it hard to be fully custom
  • Expensive!

IsakBosman answered Sep 25 '22 13:09