How to save models trained locally in Amazon SageMaker?

I'm trying to use a local training job in SageMaker.

Following this AWS notebook (https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_gluon_mnist/mxnet_mnist_with_gluon_local_mode.ipynb) I was able to train and predict locally.

Is there any way to train locally and have the trained model saved in the Amazon SageMaker Training Jobs section? If not, how can I properly save models I trained using local mode?

bcosta12 asked Jul 28 '20 16:07



2 Answers

There is no way to have your local mode training jobs appear in the AWS console. The intent of local mode is to allow for faster iteration/debugging before using SageMaker for training your model.

You can create SageMaker Models from local model artifacts. Compress your model artifacts into a .tar.gz file, upload that file to S3, and then create the Model (with the SDK or in the console).
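The compress-and-upload steps above could be sketched as follows. This is a minimal sketch, not an official recipe: the function names `package_model` and `upload_model` are hypothetical, the bucket and key are placeholders you supply, and it assumes `boto3` is installed with AWS credentials configured.

```python
import tarfile
from pathlib import Path


def package_model(model_dir: str, archive_path: str = "model.tar.gz") -> str:
    """Compress every file under model_dir into a gzip-compressed tar archive.

    archive_path should live outside model_dir so the archive does not
    try to include itself.
    """
    root = Path(model_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to model_dir so the artifacts sit
                # at the root of the archive.
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
    return archive_path


def upload_model(archive_path: str, bucket: str, key: str) -> str:
    """Upload the archive to S3 and return its s3:// URI.

    bucket and key are placeholders chosen by the caller.
    """
    import boto3  # imported lazily so packaging works without AWS dependencies

    boto3.client("s3").upload_file(archive_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

For example, `upload_model(package_model("model"), "my-bucket", "local-training/model.tar.gz")` would return the S3 URI to pass as the model data location when creating the Model; the directory, bucket, and key names here are illustrative only.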

Documentation:

  • https://sagemaker.readthedocs.io/en/stable/overview.html#using-models-trained-outside-of-amazon-sagemaker
  • https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html
lauren answered Oct 19 '22 10:10


As @lauren said, just compress the artifacts and create your model. Once you have trained locally, you don't need to save it as a training job, since you already have the artifacts for a model.

A training job is a combination of input_location, output_location, a chosen algorithm, and hyperparameters. That is what a training job stores; it is not the trained model itself. When a training job completes, it compresses the artifacts and saves your model to Amazon S3 so you can create a Model from it.

So, since you trained locally (rather than running training as a separate managed step), create a Model from the compressed artifacts, then create an endpoint and run some inferences.
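Creating the Model and endpoint could look roughly like this with the SageMaker Python SDK. Treat it as a sketch under assumptions: it assumes an MXNet model (as in the notebook from the question), the helper names `check_model_data_uri` and `deploy_local_model` are hypothetical, and the framework version, Python version, and instance type are placeholder values you would adjust to match your training environment.

```python
def check_model_data_uri(uri: str) -> str:
    """Check that the artifact location looks like an S3 URI to a .tar.gz file."""
    if not uri.startswith("s3://") or not uri.endswith(".tar.gz"):
        raise ValueError(f"expected an s3:// URI ending in .tar.gz, got: {uri}")
    return uri


def deploy_local_model(model_data: str, role: str, entry_point: str):
    """Create a SageMaker Model from uploaded artifacts and deploy an endpoint.

    model_data is the S3 URI of the compressed artifacts, role is an IAM
    role ARN, and entry_point is your inference script. All three are
    supplied by the caller.
    """
    # Imported lazily so the URI check can be used without the SDK installed.
    from sagemaker.mxnet import MXNetModel

    model = MXNetModel(
        model_data=check_model_data_uri(model_data),
        role=role,
        entry_point=entry_point,
        framework_version="1.6.0",  # assumption: match your local training version
        py_version="py3",
    )
    # Returns a predictor; call predictor.predict(...) to run inference.
    return model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```

Remember that a deployed endpoint is billed while it is running, so delete it once you have finished testing.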

Paulo Aragão answered Oct 19 '22 10:10