
Why does spark-ec2 fail with ERROR: Could not find any existing cluster?

I recently downloaded Spark and am attempting to launch my first cluster with spark-ec2. I used the following commands:

export AWS_ACCESS_KEY_ID=<myid>
export AWS_SECRET_ACCESS_KEY=<mykey>
./spark-ec2 -k my-key-pair -i my-key-pair.pem -s 2 -t m1.small -w 360 launch Spark

The startup appears to run without error. However, when I run:

./spark-ec2 -k my-key-pair -i my-key-pair.pem login Spark

it returns:

Searching for existing cluster Spark...
ERROR: Could not find any existing cluster

I cannot find any documentation on this error. Any help on how to proceed would be greatly appreciated.

Startup log (again, the cluster name Spark_1 has been replaced with Spark for clarity):

Setting up security groups...
Creating security group Spark-master
Creating security group Spark-slaves
Searching for existing cluster Spark...
Spark AMI: ami-41642728
Launching instances...
Launched 2 slaves in us-east-1b, regid = r-f6a069d8
Launched master in us-east-1b, regid = r-3ea06910
Waiting for instances to start up...
Waiting 360 more seconds...
Copying SSH key my-key-pair.pem to master...
Warning: Permanently added 'ec2-54-236-251-167.compute-1.amazonaws.com,54.236.251.167' (RSA) to the list of known hosts.
Connection to ec2-54-236-251-167.compute-1.amazonaws.com closed.
Connection to ec2-54-236-251-167.compute-1.amazonaws.com closed.
Cloning into 'spark-ec2'...
remote: Counting objects: 1171, done.
remote: Compressing objects: 100% (564/564), done.
remote: Total 1171 (delta 374), reused 1162 (delta 365)
Receiving objects: 100% (1171/1171), 186.09 KiB, done.
Resolving deltas: 100% (374/374), done.
Connection to ec2-54-236-251-167.compute-1.amazonaws.com closed.
Deploying files to master...
building file list ... done
root/spark-ec2/ec2-variables.sh

sent 1509 bytes  received 42 bytes  1034.00 bytes/sec
total size is 1368  speedup is 0.88
Running setup on master...
Connection to ec2-54-236-251-167.compute-1.amazonaws.com closed.
Setting up Spark on ip-172-31-17-14.ec2.internal...
Setting executable permissions on scripts...
Running setup-slave on master to mount filesystems, etc...
Setting up slave on ip-172-31-17-14.ec2.internal...
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 18.5205 s, 58.0 MB/s
mkswap: /mnt/swap: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=f766ec2e-1c37-4267-90ad-acde24a759d8
Added 1024 MB swap file /mnt/swap
SSH'ing to master machine(s) to approve key(s)...
ec2-54-236-251-167.compute-1.amazonaws.com
Warning: Permanently added 'ec2-54-236-251-167.compute-1.amazonaws.com,172.31.17.14' (RSA) to the list of known hosts.
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Warning: Permanently added 'ip-172-31-17-14.ec2.internal' (RSA) to the list of known hosts.
SSH'ing to other cluster nodes to approve keys...
ec2-54-236-239-94.compute-1.amazonaws.com
Warning: Permanently added 'ec2-54-236-239-94.compute-1.amazonaws.com,172.31.24.198' (RSA) to the list of known hosts.
ec2-54-236-245-195.compute-1.amazonaws.com
Warning: Permanently added 'ec2-54-236-245-195.compute-1.amazonaws.com,172.31.24.199' (RSA) to the list of known hosts.
RSYNC'ing /root/spark-ec2 to other cluster nodes...
ec2-54-236-239-94.compute-1.amazonaws.com
id_rsa                                                                                                                                           100% 1692     1.7KB/s   00:00
ec2-54-236-245-195.compute-1.amazonaws.com
id_rsa                                                                                                                                           100% 1692     1.7KB/s   00:00
Running slave setup script on other cluster nodes...
ec2-54-236-239-94.compute-1.amazonaws.com
Setting up slave on ip-172-31-24-198.ec2.internal...
ec2-54-236-245-195.compute-1.amazonaws.com
Setting up slave on ip-172-31-24-199.ec2.internal...
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 16.9615 s, 63.3 MB/s
mkswap: /mnt/swap: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=5fc9f216-7901-4753-ba10-103898a0168c
Added 1024 MB swap file /mnt/swap
Connection to ec2-54-236-239-94.compute-1.amazonaws.com closed.
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 21.932 s, 49.0 MB/s
mkswap: /mnt/swap: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=b4ae6967-1bb3-415e-92cd-b667cb184a57
Added 1024 MB swap file /mnt/swap
Connection to ec2-54-236-245-195.compute-1.amazonaws.com closed.
Initializing spark
~ ~/spark-ec2
--2014-01-16 19:05:57--  http://d3kbcqa49mib13.cloudfront.net/spark-0.8.0-incubating-bin-hadoop1.tgz
Resolving d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)... 54.230.103.217, 216.137.33.65, 216.137.33.222, ...
Connecting to d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)|54.230.103.217|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 133589594 (127M) [application/x-compressed]
Saving to: ‘spark-0.8.0-incubating-bin-hadoop1.tgz’

100%[=========================================================================================================================================>] 133,589,594 33.6MB/s   in 3.9s

2014-01-16 19:06:01 (32.9 MB/s) - ‘spark-0.8.0-incubating-bin-hadoop1.tgz’ saved [133589594/133589594]

Unpacking Spark
~/spark-ec2
Initializing shark
~ ~/spark-ec2
--2014-01-16 19:06:27--  http://d3kbcqa49mib13.cloudfront.net/shark-0.8.0-bin-hadoop1-ec2.tgz
Resolving d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)... 54.230.103.93, 54.230.103.217, 216.137.33.65, ...
Connecting to d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)|54.230.103.93|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 79340270 (76M) [application/x-compressed]
Saving to: ‘shark-0.8.0-bin-hadoop1-ec2.tgz’

100%[=========================================================================================================================================>] 79,340,270  33.4MB/s   in 2.3s

2014-01-16 19:06:30 (33.4 MB/s) - ‘shark-0.8.0-bin-hadoop1-ec2.tgz’ saved [79340270/79340270]

Unpacking Shark
~/spark-ec2
Initializing ephemeral-hdfs
~ ~/spark-ec2
--2014-01-16 19:06:36--  http://d3kbcqa49mib13.cloudfront.net/hadoop-1.0.4.tar.gz
Resolving d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)... 54.230.102.206, 54.230.103.93, 54.230.103.217, ...
Connecting to d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)|54.230.102.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62793050 (60M) [application/x-gzip]
Saving to: ‘hadoop-1.0.4.tar.gz’

100%[=========================================================================================================================================>] 62,793,050  33.8MB/s   in 1.8s

2014-01-16 19:06:38 (33.8 MB/s) - ‘hadoop-1.0.4.tar.gz’ saved [62793050/62793050]

Unpacking Hadoop
RSYNC'ing /root/ephemeral-hdfs to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
~/spark-ec2
Initializing persistent-hdfs
~ ~/spark-ec2
--2014-01-16 19:08:04--  http://d3kbcqa49mib13.cloudfront.net/hadoop-1.0.4.tar.gz
Resolving d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)... 54.230.100.174, 54.230.101.43, 54.230.101.104, ...
Connecting to d3kbcqa49mib13.cloudfront.net (d3kbcqa49mib13.cloudfront.net)|54.230.100.174|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62793050 (60M) [application/x-gzip]
Saving to: ‘hadoop-1.0.4.tar.gz’

100%[=========================================================================================================================================>] 62,793,050  31.3MB/s   in 1.9s

2014-01-16 19:08:06 (31.3 MB/s) - ‘hadoop-1.0.4.tar.gz’ saved [62793050/62793050]

Unpacking Hadoop
RSYNC'ing /root/persistent-hdfs to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
~/spark-ec2
Initializing spark-standalone
Initializing ganglia
Connection to ec2-54-236-239-94.compute-1.amazonaws.com closed.
Connection to ec2-54-236-245-195.compute-1.amazonaws.com closed.
Creating local config files...
Connection to ec2-54-236-239-94.compute-1.amazonaws.com closed.
Configuring /etc/ganglia/gmond.conf
Configuring /etc/ganglia/gmetad.conf
Configuring /etc/httpd/conf.d/ganglia.conf
Configuring /etc/httpd/conf/httpd.conf
Configuring /root/mapreduce/hadoop.version
Configuring /root/mapreduce/conf/core-site.xml
Configuring /root/mapreduce/conf/slaves
Configuring /root/mapreduce/conf/mapred-site.xml
Configuring /root/mapreduce/conf/hdfs-site.xml
Configuring /root/mapreduce/conf/hadoop-env.sh
Configuring /root/mapreduce/conf/masters
Configuring /root/persistent-hdfs/conf/core-site.xml
Configuring /root/persistent-hdfs/conf/slaves
Configuring /root/persistent-hdfs/conf/mapred-site.xml
Configuring /root/persistent-hdfs/conf/hdfs-site.xml
Configuring /root/persistent-hdfs/conf/hadoop-env.sh
Configuring /root/persistent-hdfs/conf/masters
Configuring /root/ephemeral-hdfs/conf/core-site.xml
Configuring /root/ephemeral-hdfs/conf/slaves
Configuring /root/ephemeral-hdfs/conf/mapred-site.xml
Configuring /root/ephemeral-hdfs/conf/hadoop-metrics2.properties
Configuring /root/ephemeral-hdfs/conf/hdfs-site.xml
Configuring /root/ephemeral-hdfs/conf/hadoop-env.sh
Configuring /root/ephemeral-hdfs/conf/masters
Configuring /root/spark/conf/core-site.xml
Configuring /root/spark/conf/spark-env.sh
Configuring /root/tachyon/conf/slaves
Configuring /root/tachyon/conf/tachyon-env.sh
Configuring /root/shark/conf/shark-env.sh
Deploying Spark config files...
RSYNC'ing /root/spark/conf to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Setting up spark
RSYNC'ing /root/spark to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Setting up shark
RSYNC'ing /root/shark to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
RSYNC'ing /root/hive-0.9.0-bin to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Setting up ephemeral-hdfs
~/spark-ec2/ephemeral-hdfs ~/spark-ec2
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Connection to ec2-54-236-239-94.compute-1.amazonaws.com closed.
Connection to ec2-54-236-245-195.compute-1.amazonaws.com closed.
RSYNC'ing /root/ephemeral-hdfs/conf to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Formatting ephemeral HDFS namenode...
Warning: $HADOOP_HOME is deprecated.

14/01/16 19:11:10 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ip-172-31-17-14.ec2.internal/172.31.17.14
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
14/01/16 19:11:11 INFO util.GSet: VM type       = 64-bit
14/01/16 19:11:11 INFO util.GSet: 2% max memory = 19.33375 MB
14/01/16 19:11:11 INFO util.GSet: capacity      = 2^21 = 2097152 entries
14/01/16 19:11:11 INFO util.GSet: recommended=2097152, actual=2097152
14/01/16 19:11:12 INFO namenode.FSNamesystem: fsOwner=root
14/01/16 19:11:13 INFO namenode.FSNamesystem: supergroup=supergroup
14/01/16 19:11:13 INFO namenode.FSNamesystem: isPermissionEnabled=false
14/01/16 19:11:13 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/01/16 19:11:13 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/01/16 19:11:13 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/01/16 19:11:13 INFO common.Storage: Image file of size 110 saved in 0 seconds.
14/01/16 19:11:13 INFO common.Storage: Storage directory /mnt/ephemeral-hdfs/dfs/name has been successfully formatted.
14/01/16 19:11:13 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-172-31-17-14.ec2.internal/172.31.17.14
************************************************************/
Starting ephemeral HDFS...
./ephemeral-hdfs/setup.sh: line 31: /root/ephemeral-hdfs/sbin/start-dfs.sh: No such file or directory
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /mnt/ephemeral-hdfs/logs/hadoop-root-namenode-ip-172-31-17-14.ec2.internal.out
ec2-54-236-239-94.compute-1.amazonaws.com: Warning: $HADOOP_HOME is deprecated.
ec2-54-236-239-94.compute-1.amazonaws.com:
ec2-54-236-239-94.compute-1.amazonaws.com: starting datanode, logging to /mnt/ephemeral-hdfs/logs/hadoop-root-datanode-ip-172-31-24-198.ec2.internal.out
ec2-54-236-245-195.compute-1.amazonaws.com: Warning: $HADOOP_HOME is deprecated.
ec2-54-236-245-195.compute-1.amazonaws.com:
ec2-54-236-245-195.compute-1.amazonaws.com: starting datanode, logging to /mnt/ephemeral-hdfs/logs/hadoop-root-datanode-ip-172-31-24-199.ec2.internal.out
ec2-54-236-251-167.compute-1.amazonaws.com: Warning: $HADOOP_HOME is deprecated.
ec2-54-236-251-167.compute-1.amazonaws.com:
ec2-54-236-251-167.compute-1.amazonaws.com: starting secondarynamenode, logging to /mnt/ephemeral-hdfs/logs/hadoop-root-secondarynamenode-ip-172-31-17-14.ec2.internal.out
~/spark-ec2
Setting up persistent-hdfs
~/spark-ec2/persistent-hdfs ~/spark-ec2
Pseudo-terminal will not be allocated because stdin is not a terminal.
Pseudo-terminal will not be allocated because stdin is not a terminal.
RSYNC'ing /root/persistent-hdfs/conf to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Formatting persistent HDFS namenode...
Warning: $HADOOP_HOME is deprecated.

14/01/16 19:11:32 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ip-172-31-17-14.ec2.internal/172.31.17.14
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
14/01/16 19:11:33 INFO util.GSet: VM type       = 64-bit
14/01/16 19:11:33 INFO util.GSet: 2% max memory = 19.33375 MB
14/01/16 19:11:33 INFO util.GSet: capacity      = 2^21 = 2097152 entries
14/01/16 19:11:33 INFO util.GSet: recommended=2097152, actual=2097152
14/01/16 19:11:35 INFO namenode.FSNamesystem: fsOwner=root
14/01/16 19:11:36 INFO namenode.FSNamesystem: supergroup=supergroup
14/01/16 19:11:36 INFO namenode.FSNamesystem: isPermissionEnabled=false
14/01/16 19:11:36 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/01/16 19:11:36 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/01/16 19:11:36 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/01/16 19:11:36 INFO common.Storage: Image file of size 110 saved in 0 seconds.
14/01/16 19:11:36 INFO common.Storage: Storage directory /vol/persistent-hdfs/dfs/name has been successfully formatted.
14/01/16 19:11:36 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-172-31-17-14.ec2.internal/172.31.17.14
************************************************************/
Persistent HDFS installed, won't start by default...
~/spark-ec2
Setting up spark-standalone
RSYNC'ing /root/spark/conf to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
RSYNC'ing /root/spark-ec2 to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com: no org.apache.spark.deploy.worker.Worker to stop
ec2-54-236-239-94.compute-1.amazonaws.com: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to /root/spark/bin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-ip-172-31-17-14.ec2.internal.out
ec2-54-236-239-94.compute-1.amazonaws.com: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/bin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-172-31-24-198.ec2.internal.out
ec2-54-236-245-195.compute-1.amazonaws.com: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/bin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-172-31-24-199.ec2.internal.out
Setting up ganglia
RSYNC'ing /etc/ganglia to slaves...
ec2-54-236-239-94.compute-1.amazonaws.com
ec2-54-236-245-195.compute-1.amazonaws.com
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to ec2-54-236-239-94.compute-1.amazonaws.com closed.
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to ec2-54-236-245-195.compute-1.amazonaws.com closed.
Shutting down GANGLIA gmetad:                              [FAILED]
Starting GANGLIA gmetad:                                   [  OK  ]
Stopping httpd:                                            [FAILED]
Starting httpd:                                            [  OK  ]
Connection to ec2-54-236-251-167.compute-1.amazonaws.com closed.
Spark standalone cluster started at http://ec2-54-236-251-167.compute-1.amazonaws.com:8080
Ganglia started at http://ec2-54-236-251-167.compute-1.amazonaws.com:5080/ganglia
Done!

Asked by LFoos24

1 Answer

I had the same problem with Spark 0.9.1 and the updated spark-ec2 script. After a successful deploy I tried to log in:

./spark-ec2 -k my-key-pair -i my-key-pair.pem login MY_SPARK_CLUSTER

and this gave the error:

Searching for existing cluster MY_SPARK_CLUSTER...
ERROR: Could not find any existing cluster

The problem was that my cluster was in the eu-west-1 region, while the default region is us-east-1. So when logging in you need to add the --region option:

./spark-ec2 -k my-key-pair -i my-key-pair.pem login MY_SPARK_CLUSTER --region=YOUR_REGION
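
If you are not sure which region spark-ec2 actually put the instances in, one way to check is to query EC2 for running instances in the master security group, which spark-ec2 names <cluster-name>-master (Spark-master in the launch log above). This is a minimal sketch, assuming the AWS CLI is installed and configured with the same credentials; swap in each candidate region until your instances show up:

# Look for the cluster's master instance in a candidate region.
# spark-ec2 creates security groups named <cluster-name>-master and
# <cluster-name>-slaves, so filter on the master group name.
aws ec2 describe-instances \
    --region eu-west-1 \
    --filters "Name=instance.group-name,Values=Spark-master" \
              "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].[InstanceId,Placement.AvailabilityZone,PublicDnsName]" \
    --output table

Whichever region returns your instances is the one to pass via --region; the same flag applies to the other spark-ec2 actions (get-master, stop, destroy) as well.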

Answered by Dmitriy Selivanov