I am trying to follow the tutorial here to implement a custom inference pipeline for feature preprocessing. It uses the SageMaker Python SDK's SKLearn estimator to bring in a custom preprocessing pipeline from a script. For example:
from sagemaker.sklearn.estimator import SKLearn

script_path = 'preprocessing.py'

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)
However, I can't find a way to send multiple files. The reason I need multiple files is that a custom class used in the sklearn pipeline needs to be imported from a custom module. When the class is defined in the same preprocessing.py file instead, it raises AttributeError: module '__main__' has no attribute 'CustomClassName' (at least I think it's related to the way pickle works).
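For what it's worth, here is a minimal sketch (runnable outside SageMaker) of why the pickle round-trip fails; the transformer body is illustrative:

# pickle records classes by module path. When preprocessing.py runs as the
# main module during training, CustomClassName is saved as
# __main__.CustomClassName, so any other process that loads the artifact
# (e.g. the inference container) cannot resolve it.
import joblib
from sklearn.base import BaseEstimator, TransformerMixin

class CustomClassName(BaseEstimator, TransformerMixin):
    # Illustrative no-op transformer.
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

joblib.dump(CustomClassName(), "model.joblib")
# In a fresh interpreter that never defined the class:
#   joblib.load("model.joblib")
#   AttributeError: module '__main__' has no attribute 'CustomClassName'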
Anyone know if sending multiple files is even possible?
Newbie to SageMaker, thanks!
You can use Amazon SageMaker to train and deploy a model using custom Scikit-learn code. The SageMaker Python SDK Scikit-learn estimators and models, together with the SageMaker open-source Scikit-learn containers, make it easier to write a Scikit-learn script and run it in SageMaker.
An inference pipeline is an Amazon SageMaker model composed of a linear sequence of two to fifteen containers that process requests for inferences on data.
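As a sketch of how that maps to code: the SDK's PipelineModel chains a fitted preprocessor with a second model. Here role and sklearn_preprocessor come from your snippet, and linear_model is a hypothetical second estimator that has also been trained:

from sagemaker.pipeline import PipelineModel

# Assumes both estimators have already been fit(); linear_model is a
# hypothetical second estimator trained on the preprocessed features.
preprocessor_model = sklearn_preprocessor.create_model()
predictor_model = linear_model.create_model()

pipeline_model = PipelineModel(
    name="preprocess-then-predict",
    role=role,
    models=[preprocessor_model, predictor_model])  # containers run in order

# Deploying creates one endpoint; each request passes through the
# preprocessing container first, then the prediction container.
pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.c4.xlarge")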
This is where an Amazon SageMaker endpoint comes in: a fully managed service that lets you make real-time inferences via a REST API.
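For illustration, invoking such an endpoint with boto3; the endpoint name and CSV payload below are placeholders:

import boto3

runtime = boto3.client("sagemaker-runtime")

# EndpointName and the CSV body are placeholder values.
response = runtime.invoke_endpoint(
    EndpointName="my-inference-pipeline",
    ContentType="text/csv",
    Body="5.1,3.5,1.4,0.2")

print(response["Body"].read().decode("utf-8"))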
At the moment SageMaker Inference has four main options: Real-Time Inference, Batch Transform, Asynchronous Inference, and now Serverless Inference.
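For example, switching to Serverless Inference is a matter of passing a serverless config at deploy time. A sketch, assuming model is an already-built SageMaker Model object and the capacity values are illustrative:

from sagemaker.serverless import ServerlessInferenceConfig

# Deploy without provisioning instances; memory and concurrency values
# here are illustrative, not recommendations.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=2048,
    max_concurrency=5)

predictor = model.deploy(serverless_inference_config=serverless_config)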
There's a source_dir parameter which will "lift" a directory of files into the container and put it on your import path.
Your entry point script should go in that directory too and be referenced relative to it.
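A sketch of how that looks for your case; the file and directory names are illustrative:

# Suggested layout (names are illustrative):
#
#   source/
#       preprocessing.py   # entry point; starts with: from custom_module import CustomClassName
#       custom_module.py   # defines CustomClassName
#
from sagemaker.sklearn.estimator import SKLearn

sklearn_preprocessor = SKLearn(
    entry_point="preprocessing.py",   # path relative to source_dir
    source_dir="source",              # whole directory is uploaded and put on sys.path
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)

Because the class is now pickled as custom_module.CustomClassName rather than __main__.CustomClassName, the inference container can import the module and unpickle the model.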