How to run TensorFlow on an AWS cluster?

Tags: python, amazon-web-services, amazon-ec2, tensorflow

I'm trying to run distributed TensorFlow on an EMR/EC2 cluster, but I don't know how to assign parts of the code to specific instances in the cluster.

The documentation uses tf.device("/gpu:0") to select a GPU on the local machine. But what if I have a CPU master instance and 5 GPU slave instances running in an EMR cluster, and I want some of the code to run on those GPUs? I can't pass the instances' public DNS names to tf.device(), because it throws an error saying the name cannot be resolved.

asked Jul 13 '16 21:07

charmander

1 Answer

Since you asked this question, AWS has released code that makes it easier to run distributed TensorFlow on an EC2 cluster.

See this GitHub repository. Everything is described in its README.md, but in short it creates an AWS stack with:

- Security groups
- An Elastic File System (EFS)
- EC2 instances running the AWS Deep Learning AMI, with the EFS mounted on them

The EC2 instances are configured so that you can launch distributed TensorFlow training with a single command on the master node (see the "Running Distributed Training on TensorFlow" section of the README).
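On the tf.device() part of the question: as far as I know, tf.device() takes TensorFlow device names rather than host names. In distributed TensorFlow the machines are declared once in a cluster specification, and a device on another machine is then addressed by job name and task index. Below is a minimal sketch of that naming scheme, not the code from the AWS repository. The IP addresses, port 2222, and the helper names build_cluster and device_name are hypothetical and only for illustration; the job names "ps" and "worker" are a common convention, not a requirement.

```python
import json

# Hypothetical private addresses (assumption): replace them with the private
# IPs or DNS names of your own instances. Port 2222 is only a convention.
PS_HOSTS = ["10.0.0.10:2222"]                                  # CPU master
WORKER_HOSTS = ["10.0.0.%d:2222" % i for i in range(11, 16)]   # 5 GPU instances


def build_cluster(ps_hosts, worker_hosts):
    """Return a job-name -> address-list dict, the shape tf.train.ClusterSpec accepts."""
    return {"ps": list(ps_hosts), "worker": list(worker_hosts)}


def device_name(job, task, device="GPU", index=0):
    """Build a device string that names one device on one task of one job."""
    return "/job:%s/task:%d/device:%s:%d" % (job, task, device, index)


cluster = build_cluster(PS_HOSTS, WORKER_HOSTS)
# One device string per worker instance, each pointing at that instance's first GPU.
worker_gpus = [device_name("worker", t) for t in range(len(cluster["worker"]))]

print(json.dumps(cluster, indent=2))
print(worker_gpus)

# With TensorFlow 1.x installed, each instance would start a server for its own
# job and task, and the graph would be pinned with the strings built above:
#   spec = tf.train.ClusterSpec(cluster)
#   server = tf.train.Server(spec, job_name="worker", task_index=0)
#   with tf.device(worker_gpus[0]):
#       ...  # ops placed on the first GPU of worker 0
```

The addresses only need to be reachable between the instances themselves, so private addresses inside the same VPC are usually a better fit than public DNS names, provided the security group allows traffic on the chosen port.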

answered Sep 17 '22 18:09

pfm
