
Cluster computing using StarCluster and IPython on AWS

I'm trying to experiment with cluster computing on AWS. I'm completely new to this and am having some issues. I'm following the tutorial found here: http://star.mit.edu/cluster/docs/latest/plugins/ipython.html#using-the-ipython-cluster. I use starcluster to start a cluster instance with the following:

starcluster start mycluster

Everything comes up as expected and it shows that the ipython plugin has loaded. I then try to execute the following command as shown in the tutorial:

starcluster sshmaster mycluster -u myuser

The connection fails, however, and tells me

Permission denied (publickey).

I am able to log in using

starcluster sshmaster mycluster

so I attempted to continue the tutorial while logged in to the master, but when I try to create the Client I receive an error:

AssertionError: Not a valid connection file or url: u'/root/.ipython/profile_default/security/ipcontroller-client.json'

The only thing I saw that seemed out of the ordinary was this output while the cluster was starting up:

>>> Running plugin ipcluster
>>> Writing IPython cluster config files
>>> Starting IPython cluster with 7 engines
>>> Waiting for JSON connector file... 
>>> Creating IPCluster cache directory: /Users/username/.starcluster/ipcluster
>>> Saving JSON connector file to '/Users/username/.starcluster/ipcluster/mycluster-us-east-1.json'
!!! ERROR - Error occurred while running plugin 'ipcluster':
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/cluster.py", line 1506, in run_plugin
    func(*args)
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/plugins/ipcluster.py", line 276, in run
    plug.run(nodes, master, user, user_shell, volumes)
  File "<string>", line 2, in run
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/utils.py", line 87, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/plugins/ipcluster.py", line 228, in run
    cfile = self._start_cluster(master, n, profile_dir)
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/plugins/ipcluster.py", line 173, in _start_cluster
    master.ssh.get(json, local_json)
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/sshutils/__init__.py", line 431, in get
    self.scp.get(remotepaths, localpath, recursive=recursive)
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/sshutils/scp.py", line 141, in get
    self._recv_all()
  File "/Library/Python/2.7/site-packages/StarCluster-0.93.3-py2.7.egg/starcluster/sshutils/scp.py", line 242, in _recv_all
    msg = self.channel.recv(1024)
  File "build/bdist.macosx-10.8-intel/egg/ssh/channel.py", line 611, in recv
    raise socket.timeout()
timeout

Any thoughts?

user1074057 asked Sep 15 '12 22:09


1 Answer

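The tutorial assumes CLUSTER_USER = myuser in ~/.starcluster/config, even though by default CLUSTER_USER = sgeadmin. With the default setting the cluster has no myuser account to log in to, which is why the -u myuser login is rejected. Either pass -u sgeadmin to sshmaster, or set CLUSTER_USER = myuser in your config.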
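A minimal sketch of the relevant part of ~/.starcluster/config, assuming the cluster template is named smallcluster (a placeholder here; use whatever template name your own config defines). All other settings the template needs are left out:

```ini
# ~/.starcluster/config (excerpt)
# "smallcluster" is a hypothetical template name used only for illustration.
[cluster smallcluster]
# Matches the user name the tutorial passes to "sshmaster -u".
# If this line is absent, the answer above says the default is sgeadmin.
CLUSTER_USER = myuser
```

A change like this presumably only applies to clusters started after the config is edited. Without it, the tutorial's login command would instead be starcluster sshmaster mycluster -u sgeadmin.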

Ishan Arora answered Oct 20 '22 04:10