I have a virtual machine with Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily because of certain functionality that is not supported in 1.3. Is it possible to upgrade Spark from 1.3 to 1.5 and, if so, how can I do that?
Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". Actually, all you have to do is:
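- Download the appropriate pre-built Spark distribution
- Unpack the tar file in the appropriate directory (i.e. where the folder spark-1.3.1-bin-hadoop2.6 already is)
- Update SPARK_HOME (and possibly some other environment variables, depending on your setup) accordingly

Here is what I just did myself, to go from 1.3.1 to 1.5.2, in a setting similar to yours (Vagrant VM running Ubuntu):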
1) Download the tar file in the appropriate directory
vagrant@sparkvm2:~$ cd $SPARK_HOME
vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema
ipcontroller ipengine2 ipython pygmentize
vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
[...]
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6.tgz
ipcontroller ipengine2 ipython pygmentize
Note that the exact mirror to use with wget will probably differ from mine, depending on your location; you can get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
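Mirrors tend to carry only recent releases, so an older version such as 1.5.2 may no longer be on the mirror shown above. Here is a minimal sketch of building the download URL, assuming the release is kept in the Apache archive (archive.apache.org) under its usual dist/spark layout. The variable names are only illustrative, and the download line is commented out so the snippet can be tried without network access:

```shell
# Assumed values: adjust the version and Hadoop build to your setup.
SPARK_VERSION=1.5.2
HADOOP_BUILD=hadoop2.6

# File name follows the pattern seen above: spark-<version>-bin-<hadoop build>.tgz
SPARK_TGZ="spark-${SPARK_VERSION}-bin-${HADOOP_BUILD}.tgz"

# Assumption: old releases are served from the Apache archive under dist/spark/.
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_TGZ}"

echo "$SPARK_URL"

# Uncomment to actually download (sudo may be needed, as above):
# sudo wget "$SPARK_URL"
```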
2) Unpack the tgz file with
vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6
ipcontroller ipengine2 ipython pygmentize spark-1.5.2-bin-hadoop2.6.tgz
You can see that now you have a new folder, spark-1.5.2-bin-hadoop2.6.
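The wildcard in the tar command above only works while a single matching tarball is present; with several, tar treats the extra names as members to extract from the first archive. Here is a minimal sketch of the same step with an explicit file name. It runs in a scratch directory with a dummy tarball, both of which are assumptions made so that the snippet works anywhere, without sudo and without the real download:

```shell
# Work in a throwaway directory instead of /usr/local/bin.
WORKDIR="$(mktemp -d)"
cd "$WORKDIR"

# Build a stand-in for the real tarball (placeholder contents, for illustration only).
mkdir -p spark-1.5.2-bin-hadoop2.6/bin
echo "placeholder" > spark-1.5.2-bin-hadoop2.6/bin/spark-submit
tar -czf spark-1.5.2-bin-hadoop2.6.tgz spark-1.5.2-bin-hadoop2.6
rm -r spark-1.5.2-bin-hadoop2.6

# Unpack by exact name: no wildcard, so other tarballs in the directory are ignored.
tar -xzf spark-1.5.2-bin-hadoop2.6.tgz

# The folder and the tarball should now sit side by side.
ls
```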
3) Update SPARK_HOME (and possibly other environment variables you are using) so that it points to this new directory instead of the previous one.
And you should be done, after restarting your machine.
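How SPARK_HOME gets set depends on your setup. Here is a minimal sketch, assuming it is exported from a shell startup file such as ~/.bashrc (your VM may define it elsewhere) and that the new folder lives in /usr/local/bin as in the listing above:

```shell
# Point SPARK_HOME at the newly unpacked folder (path taken from the listing above).
export SPARK_HOME=/usr/local/bin/spark-1.5.2-bin-hadoop2.6

# Optional: put the Spark launcher scripts on the PATH.
export PATH="$SPARK_HOME/bin:$PATH"

# Confirm which directory is now in use.
echo "SPARK_HOME is $SPARK_HOME"

# Only runs if Spark is really unpacked there; prints the version banner.
if [ -x "$SPARK_HOME/bin/spark-submit" ]; then
    "$SPARK_HOME/bin/spark-submit" --version
fi
```

If the variable is defined in a file that is only read at login or boot, the new value takes effect after the next login or restart.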
Notice that:
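- sudo was necessary in my case; it may be unnecessary for you, depending on your settings
- once everything works, you can delete the downloaded .tgz file
- if you repeat this procedure later, either make sure that previous .tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above)

An alternative approach that relies on a symlink:

1) Set SPARK_HOME to /opt/spark
2) Download the pre-built binary, e.g. spark-2.2.1-bin-hadoop2.7.tgz (you can use wget)
3) Create the symlink to the downloaded version: ln -s /opt/spark-2.2.1 /opt/spark
4) Update the files in $SPARK_HOME/conf accordingly

For every new version you download, just create the symlink to it (step 3):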
ln -s /opt/spark-x.x.x /opt/spark
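The symlink step can be sketched as follows. When /opt/spark already exists as a link to a directory, plain ln -s (at least with GNU ln) creates the new link inside the old folder instead of replacing it, so this version uses ln -sfn. The scratch directory stands in for /opt and the empty version folders are placeholders, both assumptions made so that the snippet runs without root:

```shell
# Scratch directory standing in for /opt (assumption, to avoid needing root).
OPT="$(mktemp -d)"

# Placeholder folders standing in for two unpacked Spark versions.
mkdir "$OPT/spark-2.2.0" "$OPT/spark-2.2.1"

# First install: create the link that SPARK_HOME points to.
ln -s "$OPT/spark-2.2.0" "$OPT/spark"

# Upgrade: repoint the existing link.
# -f replaces the old link, -n keeps ln from following it into the old folder.
ln -sfn "$OPT/spark-2.2.1" "$OPT/spark"

# Show where the link points now.
readlink "$OPT/spark"
```

Since SPARK_HOME keeps pointing at the same path, nothing else should need to change when the link is moved to a newer version.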