I have a virtual machine with Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily because of certain features that are not supported in 1.3. Is it possible to upgrade Spark from 1.3 to 1.5, and if so, how can I do that?
Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". All you have to do is:
- Download the appropriate pre-built Spark distribution
- Unpack the tar file in the appropriate directory (i.e. where the folder spark-1.3.1-bin-hadoop2.6 already is)
- Update your SPARK_HOME (and possibly some other environment variables depending on your setup) accordingly
Here is what I just did myself, to go from 1.3.1 to 1.5.2, in a setting similar to yours (Vagrant VM running Ubuntu):
1) Download the tar file into the appropriate directory
vagrant@sparkvm2:~$ cd $SPARK_HOME
vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema
ipcontroller ipengine2 ipython pygmentize
vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
[...]
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6.tgz
ipcontroller ipengine2 ipython pygmentize
Notice that the exact mirror you should use with wget will probably be different from mine, depending on your location; you get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
2) Unpack the tgz file with
vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6
ipcontroller ipengine2 ipython pygmentize spark-1.5.2-bin-hadoop2.6.tgz
You can see that you now have a new folder, spark-1.5.2-bin-hadoop2.6.
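Since the pattern spark-1.*.tgz matches every archive with that prefix, a slightly more defensive variant of this step names the archive explicitly. The following is only a sketch: the helper name unpack_spark is hypothetical, and the paths in the usage note are the ones from the session above.

```shell
# Hypothetical helper: unpack exactly one archive into a destination directory.
unpack_spark() {
    archive="$1"   # full path to the .tgz file
    dest="$2"      # directory that should receive the unpacked folder

    # Refuse to continue if the archive is missing, instead of letting tar fail later.
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }

    # -x extract, -z gunzip, -f archive file, -C change to dest before extracting.
    tar -xzf "$archive" -C "$dest"
}
```

In the setting above it would be called as `unpack_spark /usr/local/bin/spark-1.5.2-bin-hadoop2.6.tgz /usr/local/bin`, prefixed with sudo if your permissions require it.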
3) Update SPARK_HOME (and possibly other environment variables you are using) accordingly, so that it points to this new directory instead of the previous one.
And you should be done, after restarting your machine.
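Step 3 could look roughly like the lines below, for example in ~/.bashrc. This is a sketch under assumptions: where SPARK_HOME is actually defined depends on your setup, and the PATH line is only needed if you rely on PATH to find the Spark scripts. The directory is the one created in step 2.

```shell
# Assumed install location, taken from the session above; adjust to your setup.
export SPARK_HOME=/usr/local/bin/spark-1.5.2-bin-hadoop2.6
export PATH="$SPARK_HOME/bin:$PATH"

# Sanity check: print the Spark version only if the new directory is really there.
if [ -x "$SPARK_HOME/bin/spark-submit" ]; then
    "$SPARK_HOME/bin/spark-submit" --version
else
    echo "spark-submit not found under $SPARK_HOME" >&2
fi
```

If the check prints the old version or the "not found" message, SPARK_HOME is probably still being set somewhere else in your configuration.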
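Notice that:
- sudo was necessary in my case; it may be unnecessary for you depending on your settings.
- [...] tgz file.
- Make sure that any previous tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above).
An alternative is to keep SPARK_HOME fixed and switch versions with a symlink:
1) Set your SPARK_HOME to /opt/spark
2) Download the latest pre-built binary, e.g. spark-2.2.1-bin-hadoop2.7.tgz - can use wget
3) Create the symlink to the latest download: ln -s /opt/spark-2.2.1 /opt/spark
4) Edit the files in $SPARK_HOME/conf accordingly
For every new version you download just create the symlink to it (step 3)
ln -s /opt/spark-x.x.x /opt/spark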
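One caveat when repeating the symlink step: if /opt/spark already exists as a link to a directory, a plain ln -s typically creates the new link inside the old version's directory rather than replacing /opt/spark. A possible way around this, sketched here with a hypothetical helper named switch_spark, is to pass -f and -n to ln. The layout root/spark-x.x.x is assumed from the commands above.

```shell
# Hypothetical helper: point <root>/spark at <root>/spark-<version>.
switch_spark() {
    root="$1"      # directory holding the unpacked versions, e.g. /opt
    version="$2"   # version suffix of the directory to activate, e.g. 2.2.1
    target="$root/spark-$version"

    # Do not leave a dangling link behind if the version was never unpacked.
    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }

    # -s symbolic link, -f replace an existing link,
    # -n treat an existing link to a directory as a file instead of following it.
    ln -sfn "$target" "$root/spark"
}
```

With the directories from the steps above this would be `switch_spark /opt 2.2.1`; SPARK_HOME stays /opt/spark throughout, so no environment variable has to change between versions.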