
AWS SageMaker Minimum Configuration

Why do I need a container for AWS SageMaker? If I want to run scikit-learn in a SageMaker Jupyter notebook for self-learning purposes, do I still need to configure a container for it?

What is the minimum SageMaker configuration I need if I just want to learn scikit-learn? For example, I want to run scikit-learn's decision tree algorithm with a set of training data and a set of test data. What do I need to do on SageMaker to perform these tasks? Thanks.

David293836 asked May 12 '18 04:05



2 Answers

You don't need much: just an AWS account and a role with the appropriate permissions. In the AWS SageMaker console you can start a notebook instance with one click. scikit-learn is preinstalled there and you can use it out of the box. No special container is needed.

At a minimum, you need an AWS account with permissions to create EC2 instances and to read from and write to your S3 buckets. That's all, just try it. :)

Use this as a starting point: Amazon SageMaker – Accelerating Machine Learning

You can also access it via the Jupyter terminal.

Pablo answered Oct 16 '22 07:10


If you are not concerned with using SageMaker's training and deployment features, you just need to create a new conda_python3 notebook and import sklearn.
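
As a minimal sketch of what that looks like for the decision tree case from the question: the cell below trains and evaluates a tree entirely inside the notebook. The Iris dataset, the 75/25 split, and `max_depth=3` are assumptions chosen for illustration, not anything SageMaker requires; substitute your own training and test data.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Iris is a stand-in dataset (an assumption); swap in your own data here.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a shallow decision tree on the training set only.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Score the fitted tree on the held-out test set.
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.3f}")
```

Nothing here touches SageMaker's training or hosting features; it runs on the notebook instance itself, exactly as it would on a laptop.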

I too was confused about how to take advantage of SageMaker's train/deploy features with scikit-learn. The best and most up-to-date explanation seems to be:

https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/sklearn/README.rst

The brief summary is:

  1. You save your training data to an S3 bucket.
  2. Create a standalone Python script that does your training, serializes the trained model to a file and saves it to an S3 bucket.
  3. In a notebook on SageMaker you import the SageMaker SDK and point it to your training script and data. SageMaker will then temporarily create an AWS instance to train the model.
  4. Once the model is trained, that instance is automatically destroyed.
  5. Finally you use the SageMaker SDK to deploy the trained model to another AWS instance. This also automatically creates an endpoint that can be called to make predictions.
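
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the README's exact code: the file name `train.py`, the `train.csv` layout (no header, label in the first column), the instance types, and the `framework_version` value are all hypothetical choices, and the `launch` helper assumes version 2 of the SageMaker Python SDK. In practice `launch` would live in the notebook rather than in the script; it is shown in one block only for brevity.

```python
"""train.py: a hypothetical entry-point script for SageMaker's scikit-learn container."""
import os

import joblib
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def train(train_dir, model_dir, max_depth=3):
    # Assumption: train.csv has no header and the label is in the first column.
    data = np.loadtxt(os.path.join(train_dir, "train.csv"), delimiter=",", ndmin=2)
    y, X = data[:, 0], data[:, 1:]
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=0).fit(X, y)
    # In a training job, files written to model_dir are packaged and uploaded
    # to S3 by SageMaker after the script exits (step 2).
    path = os.path.join(model_dir, "model.joblib")
    joblib.dump(model, path)
    return path


def model_fn(model_dir):
    # Used by the scikit-learn serving container to load the model at the endpoint.
    return joblib.load(os.path.join(model_dir, "model.joblib"))


def launch(role, train_s3_uri):
    # Notebook side (steps 3 to 5). The SDK is imported lazily so the training
    # script itself does not depend on it. Assumes SageMaker Python SDK v2.
    from sagemaker.sklearn.estimator import SKLearn

    estimator = SKLearn(
        entry_point="train.py",
        role=role,
        instance_type="ml.m5.large",   # assumed instance type
        instance_count=1,
        framework_version="1.2-1",     # assumed container version
        py_version="py3",
    )
    # Starts a temporary training instance that is torn down when training ends.
    estimator.fit({"train": train_s3_uri})
    # Creates a separate hosting instance and an endpoint for predictions.
    return estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")


if __name__ == "__main__" and "SM_MODEL_DIR" in os.environ:
    # These environment variables are set inside the SageMaker training container;
    # SM_CHANNEL_TRAIN corresponds to the "train" key passed to fit().
    train(os.environ["SM_CHANNEL_TRAIN"], os.environ["SM_MODEL_DIR"])
```

Because `train` and `model_fn` only take directory paths, they can be tried on the notebook instance against a local folder before paying for a training job. Remember to delete the endpoint when you are done with it, since a deployed endpoint keeps an instance running.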

Guy C answered Oct 16 '22 05:10