I need to set up an AWS EC2 GPU instance with TensorFlow 2.0. All of the docs I have seen indicate that the current AWS AMI images only support TensorFlow 1.14 or 1.15, not TensorFlow 2.0. So I was wondering what the best way is to get tensorflow-gpu 2.0 on an AWS instance.
One option is to create an EC2 GPU instance, install the NVIDIA drivers, and then run a Docker container with TensorFlow 2.0 using nvidia-docker.
Or is it easier to launch an AWS AMI image with TensorFlow 1.14 and then upgrade to TensorFlow 2.0? It is not clear which approach makes more sense.
Any suggestions would be welcome.
EC2 is the service for creating virtual server instances. An AMI is the machine image that an EC2 instance is launched from. ECS provides container services such as Docker, and AWS Lambda runs code without managing a server.
If you want to run the latest, untested nightly build, you can install TensorFlow's nightly build (experimental) manually. To activate TensorFlow, open an Amazon Elastic Compute Cloud (Amazon EC2) instance of the DLAMI with Conda.
An instance is a virtual server in the cloud. Its configuration at launch is a copy of the AMI that you specified when you launched the instance. You can launch different types of instances from a single AMI. An instance type essentially determines the hardware of the host computer used for your instance.
So I went through both routes. Right now I would say that setting up a Docker container with TensorFlow 2.0 is easier than building from the AMI image.
For the Docker route, you can spin up an Ubuntu 18.04 instance with GPUs and then follow the steps below. I only lay out the basic steps without going into great detail, but hopefully this is enough guidance to help someone get started.
Start up the instance and install the docker-ce software. Make sure that network port 8888 is accessible for incoming connections.
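If you manage the instance with the AWS CLI, one way to open port 8888 is to add an ingress rule to the instance's security group. A minimal sketch; the security-group ID and CIDR below are placeholders you would replace with your own values:

```shell
# Allow inbound TCP 8888 (Jupyter) -- ideally restrict the CIDR to
# your own IP rather than 0.0.0.0/0. Both values here are placeholders.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 8888 \
  --cidr 203.0.113.10/32
```

The same rule can also be added from the EC2 console under Security Groups.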
Install the NVIDIA drivers for the particular GPU instance: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html
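Once the driver is installed, it is worth confirming that the GPU is visible before going any further:

```shell
# Should print a table listing the GPU model, driver version,
# and supported CUDA version. If this fails, fix the driver first.
nvidia-smi
```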
Install the nvidia-docker software from the NVIDIA GitHub repository. This enables the Docker container to access the GPU drivers on the EC2 instance.
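As a rough sketch of that installation on Ubuntu 18.04 (check the nvidia-docker README for the current instructions, since the repository setup has changed over time):

```shell
# Add NVIDIA's package repository for the container toolkit
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list \
  | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install the toolkit and restart Docker so it picks up the NVIDIA runtime
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```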
Download and run the TensorFlow 2.0 container with the command:
docker run -it --gpus all --rm -v $(realpath ~/Downloads):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
This should start a Jupyter notebook server that you can access from your own computer.
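Before working in the notebook, you can confirm that TensorFlow inside the container actually sees the GPU:

```shell
# Should print a non-empty list of PhysicalDevice entries, one per GPU.
# An empty list [] means the container cannot reach the driver.
docker run --rm --gpus all tensorflow/tensorflow:2.0.0-gpu-py3 \
  python -c "import tensorflow as tf; print(tf.config.experimental.list_physical_devices('GPU'))"
```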
If you want to do this through an AMI image, you basically have to install the TensorFlow 1.14 image and then upgrade it. This is actually harder than it looks. Again, this is a high-level outline of the steps, but I have tried to include links or code where I could.
Set up the Ubuntu 18.04 Deep Learning AMI (version 25.2) on the server.
Update and upgrade Ubuntu, then update conda:
sudo apt-get update
sudo apt-get upgrade
conda update conda
conda update --all
Create a conda environment for TensorFlow 2.0:
conda create -n tf2 python=3.7 tensorflow-gpu==2.0 cudatoolkit cudnn jupyter
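After the environment is created, activate it and check which TensorFlow version was actually installed:

```shell
# Activate the new environment and verify the install;
# the version printed should be 2.0.x.
conda activate tf2
python -c "import tensorflow as tf; print(tf.__version__)"
```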
Initialize conda for the bash shell. You have to do this to use conda commands from the shell, and you might need to exit the instance and then ssh back into it afterwards:
conda init bash
Install the environment_kernels package and the Jupyter notebook extensions:
pip install environment_kernels
conda install -c conda-forge jupyter_contrib_nbextensions
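To make the tf2 environment show up as a selectable kernel in Jupyter, one option (my own addition, not part of the original steps) is to register it with ipykernel from inside the activated environment:

```shell
# Register the tf2 environment as a Jupyter kernel;
# "Python (tf2)" is an arbitrary display name.
conda activate tf2
python -m ipykernel install --user --name tf2 --display-name "Python (tf2)"
```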
Install the Jupyter server on the instance. Follow the instructions on the link: https://docs.aws.amazon.com/dlami/latest/devguide/setup-jupyter-config.html
ssh into the instance and start the Jupyter server.
ssh -N -f -L 8888:localhost:8888 ubuntu@aws-public-url
Hence I would say use the first (Docker) approach rather than the second, at least until Amazon releases a TensorFlow 2.0 AMI.