 

AWS SageMaker SKLearn entry point: how to allow multiple scripts

I am trying to follow the tutorial here to implement a custom inference pipeline for feature preprocessing. It uses the SageMaker Python SDK's SKLearn estimator to bring in a custom preprocessing pipeline from a script. For example:

from sagemaker.sklearn.estimator import SKLearn

script_path = 'preprocessing.py'

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)

However, I can't find a way to send multiple files. I need multiple files because a custom class used in the sklearn pipeline has to be imported from a custom module. If the class is defined in the same preprocessing.py file instead, unpickling raises AttributeError: module '__main__' has no attribute 'CustomClassName', because pickle stores classes by their module path (at least I think it's related to pickle).
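For reference, this is roughly the layout I'd want to use if multiple files were possible (the file and class names here are just placeholders):

# --- custom_module.py (hypothetical file) ---
# Define the transformer here, not in the entry point, so pickle records it
# as custom_module.CustomClassName instead of __main__.CustomClassName.
from sklearn.base import BaseEstimator, TransformerMixin

class CustomClassName(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X  # placeholder transform logic

# --- preprocessing.py (the entry point) ---
# from custom_module import CustomClassName
# pipeline = Pipeline([('custom', CustomClassName()), ...])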

Anyone know if sending multiple files is even possible?

Newbie to SageMaker, thanks!!

asked Jan 22 '19 by Wayne Yu

People also ask

How do you use Sklearn in SageMaker?

You can use Amazon SageMaker to train and deploy a model using custom Scikit-learn code. The SageMaker Python SDK Scikit-learn estimators and models and the SageMaker open-source Scikit-learn containers make writing a Scikit-learn script and running it in SageMaker easier.

How many containers can a SageMaker inference pipeline support?

An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of two to fifteen containers that process requests for inferences on data.
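As a rough sketch of how such a pipeline is assembled with the SageMaker Python SDK (the model objects and names below are placeholders):

from sagemaker.pipeline import PipelineModel

# sklearn_model and xgb_model stand in for already-created SageMaker Model
# objects; a PipelineModel chains their containers in order.
pipeline_model = PipelineModel(
    name='inference-pipeline',
    role=role,
    models=[sklearn_model, xgb_model])

pipeline_model.deploy(
    initial_instance_count=1,
    instance_type='ml.c4.xlarge',
    endpoint_name='inference-pipeline-endpoint')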

What is SageMaker end point?

This is where an Amazon SageMaker endpoint steps in: a fully managed service that allows you to make real-time inferences via a REST API.
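For example, a deployed endpoint can be called with the low-level runtime client (the endpoint name and payload below are made up):

import boto3

runtime = boto3.client('sagemaker-runtime')

# Send a CSV row to a hypothetical endpoint and read the prediction back.
response = runtime.invoke_endpoint(
    EndpointName='my-endpoint',
    ContentType='text/csv',
    Body='1.0,2.0,3.0')

print(response['Body'].read())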

What are the inference options provided by SageMaker?

At the moment SageMaker Inference has four main options: Real-Time Inference, Batch Inference, Asynchronous Inference, and now Serverless Inference.


1 Answer

There's a source_dir parameter which will "lift" a directory of files to the container and put it on your import path.

Your entry point script should be put there too and referenced from that location.
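A minimal sketch of what that could look like, assuming a local directory named source/ that holds both files (the names are hypothetical):

# source/
# ├── preprocessing.py     (entry point; does `from custom_module import CustomClassName`)
# └── custom_module.py     (defines CustomClassName)

from sagemaker.sklearn.estimator import SKLearn

sklearn_preprocessor = SKLearn(
    entry_point='preprocessing.py',   # path relative to source_dir
    source_dir='source',              # whole directory is shipped to the container
    role=role,
    train_instance_type='ml.c4.xlarge',
    sagemaker_session=sagemaker_session)

Because the whole directory lands on the container's import path, the pickled pipeline can resolve custom_module.CustomClassName at load time, which avoids the __main__ attribute error from the question.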

answered Sep 29 '22 by sniggatooth