I'm not sure I'm asking this the right way, but I have a Jupyter notebook that launches a TensorFlow training job using a Python training script I wrote.
That training script requires certain modules, and my SageMaker training job seems to be failing because some of those modules don't exist.
How can I ensure that my training script has all the modules it needs?
Edit:
An example of one of these modules is keras. The odd thing is, I can import keras in the Jupyter notebook, but when that import statement is in my training script I get a "No module named keras" error.
SageMaker notebooks support the following package installation tools: conda install and pip install.
To train a model in SageMaker, you create a training job. The training job includes the following information: The URL of the Amazon Simple Storage Service (Amazon S3) bucket where you've stored the training data. The compute resources that you want SageMaker to use for model training.
SageMaker training jobs save model data to .tar.gz files in S3; however, if you have local data you want to deploy, you can prepare the data yourself.
If you want to install multiple packages, one way is to upgrade to SageMaker Python SDK v2. With it, you can create a requirements.txt file in the same directory as your notebook and run the training job; SageMaker will take care of the installation automatically.
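As a rough sketch of how that requirements.txt could be generated from the notebook, the snippet below pins each package to the version installed in the notebook environment. The helper names (pinned_requirements, write_requirements) and the "src" directory are made up for illustration, and it assumes Python 3.8+ for importlib.metadata. Pinning the notebook's version is only a starting point; it is not guaranteed to match the TensorFlow version in the training container.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def pinned_requirements(packages):
    """Return requirement lines, pinned to the locally installed version when known."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed here: leave it unpinned and let pip pick a version.
            lines.append(name)
    return lines


def write_requirements(source_dir, packages):
    """Write requirements.txt into source_dir and return its path."""
    path = Path(source_dir) / "requirements.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(pinned_requirements(packages)) + "\n")
    return path


if __name__ == "__main__":
    # "src" is a hypothetical source directory; a temp directory stands in for it.
    with tempfile.TemporaryDirectory() as tmp:
        req = write_requirements(Path(tmp) / "src", ["keras"])
        print(req.read_text())
```

In the notebook you would call write_requirements with the directory that holds your training script, then launch the training job as usual.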
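If you want to stay on the v1 SDK, you can add the following snippet to your entry_point script.
import subprocess
import sys

def install(package):
    # Run pip with the same interpreter that runs the training script.
    # "-q" is passed to pip (after "install") so that it quiets pip's output.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])

# Call this before the script imports keras.
install('keras')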
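A slightly more defensive variant, as a sketch only: check whether the module is already importable and run pip only when it is missing, so the script does not reinstall on every run. The function names (is_importable, ensure_installed) are hypothetical; the calls they use are from the standard library.

```python
import importlib.util
import subprocess
import sys


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def ensure_installed(module_name, package=None):
    """pip-install the package only if the module is missing.

    Returns True if an install was performed, False if it was already present.
    """
    if is_importable(module_name):
        return False
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "-q", package or module_name]
    )
    # Make the freshly installed package visible to the import system.
    importlib.invalidate_caches()
    return True


if __name__ == "__main__":
    # json ships with Python, so nothing is installed here.
    print(ensure_installed("json"))
```

In the training script you would call ensure_installed("keras") at the top, before the import keras line. The optional package argument covers cases where the pip name differs from the module name.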
The training script runs inside a Docker container that does not have the dependency installed, whereas the Jupyter notebook environment has keras pre-installed. An easy way to handle this is to list all the requirements in a requirements.txt file and pass it in when creating your model.
from sagemaker.tensorflow import TensorFlowModel

# `role` and `sagemaker_session` are assumed to be defined earlier in the notebook.
env = {
    'SAGEMAKER_REQUIREMENTS': 'requirements.txt',  # path relative to `source_dir` below.
}
sagemaker_model = TensorFlowModel(
    model_data='s3://mybucket/modelTarFile',
    role=role,
    entry_point='entry.py',
    code_location='s3://mybucket/runtime-code/',
    source_dir='src',
    env=env,
    name='model_name',
    sagemaker_session=sagemaker_session,
)