I have a Jupyter notebook with standard template code like so:
from sagemaker.tensorflow import TensorFlow
import sagemaker
from sagemaker import get_execution_role
sagemaker_session = sagemaker.Session()
role = get_execution_role()
tf_estimator = TensorFlow(entry_point='sagemaker_predict_2.py', role=role,
                          training_steps=10000, evaluation_steps=100,
                          train_instance_count=1, train_instance_type='ml.p2.xlarge',
                          framework_version='1.10.0')
tf_estimator.fit('s3://XXX-sagemaker/XXX')
This kicks off fine but eventually throws an error:
2018-11-27 06:21:12 Starting - Starting the training job...
2018-11-27 06:21:15 Starting - Launching requested ML instances.........
2018-11-27 06:22:44 Starting - Preparing the instances for training...
2018-11-27 06:23:35 Downloading - Downloading input data...
2018-11-27 06:24:03 Training - Downloading the training image......
2018-11-27 06:25:12 Training - Training image download completed. Training in progress..
2018-11-27 06:25:11,813 INFO - root - running container entrypoint
2018-11-27 06:25:11,813 INFO - root - starting train task
2018-11-27 06:25:11,833 INFO - container_support.training - Training starting
2018-11-27 06:25:15,306 ERROR - container_support.training - uncaught exception during training: No module named keras
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
fw.train()
File "/usr/local/lib/python2.7/dist-packages/tf_container/train_entry_point.py", line 143, in train
customer_script = env.import_user_module()
File "/usr/local/lib/python2.7/dist-packages/container_support/environment.py", line 101, in import_user_module
user_module = importlib.import_module(script)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/opt/ml/code/sagemaker_predict_2.py", line 7, in <module>
import keras
ImportError: No module named keras
My sagemaker_predict_2.py needs some of these libraries:
import pandas as pd
import numpy as np
import sys
import keras
from keras.models import Model, Input
from keras.layers import LSTM, Embedding, Dense, TimeDistributed, Dropout, Bidirectional
from keras.wrappers.scikit_learn import KerasClassifier
from keras_contrib.layers import CRF
It apparently has no problem importing pandas and numpy, but it dies when importing keras. I thought keras was standard in the notebook. When I kick this script off, does it run in some other, uninitialized environment?
Also, I believe keras_contrib is not standard, so I will need a way to install it. How do I do that?
I tried !pip install keras in the cell above, but it reported "Requirement already satisfied", so it seems my Jupyter environment has the library. Does sagemaker_predict_2.py run in a different environment when I kick it off?
You are correct. sagemaker_predict_2.py runs in a different environment from your notebook instance: SageMaker executes it inside our predefined TensorFlow Docker container.
Installing your dependencies in the notebook instance only makes those libraries available to the notebook kernel.
To install your dependencies inside the running Docker container, specify them in a requirements.txt file.
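As a rough sketch of what that could look like: the snippet below writes a requirements.txt next to the entry point and collects the extra estimator arguments. It assumes the version 1 SageMaker Python SDK used in the question, where the TensorFlow estimator accepts source_dir and requirements_file (with the requirements path relative to source_dir). The directory name src is hypothetical, and installing keras_contrib from its Git repository is an assumption that only works if git is available in the container.

```python
from pathlib import Path

# Hypothetical directory holding sagemaker_predict_2.py and its requirements.
SOURCE_DIR = Path("src")

# Packages the training script imports that the container does not provide.
# Assumption: keras_contrib is installed from its Git repository.
REQUIREMENTS = [
    "keras",
    "git+https://www.github.com/keras-team/keras-contrib.git",
]


def write_requirements(source_dir, requirements):
    """Write one requirement per line to <source_dir>/requirements.txt."""
    source_dir = Path(source_dir)
    source_dir.mkdir(parents=True, exist_ok=True)
    path = source_dir / "requirements.txt"
    path.write_text("\n".join(requirements) + "\n")
    return path


requirements_path = write_requirements(SOURCE_DIR, REQUIREMENTS)

# Extra keyword arguments for the estimator (assumed SDK v1 parameter names).
estimator_kwargs = dict(
    entry_point="sagemaker_predict_2.py",
    source_dir=str(SOURCE_DIR),
    requirements_file="requirements.txt",  # relative to source_dir
)

# In the notebook these would be passed along with the original arguments:
# tf_estimator = TensorFlow(role=role, training_steps=10000,
#                           evaluation_steps=100, train_instance_count=1,
#                           train_instance_type='ml.p2.xlarge',
#                           framework_version='1.10.0', **estimator_kwargs)
```

The entry point script itself would need to be moved into the same directory so that it is uploaded together with requirements.txt.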
Since each iteration can take 8 to 10 minutes, it is recommended to use local mode to make sure that your training job runs before sending it to SageMaker. This can be done by setting train_instance_type to 'local'. For a worked example, see this notebook: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_distributed_mnist/tensorflow_local_mode_mnist.ipynb
Local mode essentially runs the Docker container on the machine that is executing the Python code. This can be a SageMaker notebook instance or your own local machine.
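One way to switch between a quick local dry run and the real job is sketched below. The helper name build_estimator_kwargs and the reduced step counts are hypothetical; 'local' and 'local_gpu' are the instance type values the SDK uses for local mode, which also assumes Docker is available on the machine.

```python
def build_estimator_kwargs(local=True, use_gpu=False):
    """Return estimator arguments for a local dry run or a hosted training job."""
    kwargs = dict(
        entry_point="sagemaker_predict_2.py",
        train_instance_count=1,
        framework_version="1.10.0",
    )
    if local:
        # Short run: only meant to confirm that imports and the script work.
        kwargs.update(
            train_instance_type="local_gpu" if use_gpu else "local",
            training_steps=10,
            evaluation_steps=1,
        )
    else:
        # Settings from the original question.
        kwargs.update(
            train_instance_type="ml.p2.xlarge",
            training_steps=10000,
            evaluation_steps=100,
        )
    return kwargs


local_kwargs = build_estimator_kwargs(local=True)
remote_kwargs = build_estimator_kwargs(local=False)
print(local_kwargs["train_instance_type"])
print(remote_kwargs["train_instance_type"])

# In the notebook, the dry run would then be:
# tf_estimator = TensorFlow(role=role, **local_kwargs)
# tf_estimator.fit('s3://XXX-sagemaker/XXX')
```

Running with local_kwargs first should surface an ImportError such as the one above within seconds of the container starting, rather than after instances have been provisioned.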