I have already created a 3-node cluster on Dataproc.
I don't want to delete the cluster and recreate it with initialization actions just to install Jupyter.
Can anyone tell me how to install Jupyter on an existing Dataproc cluster?
-Revan
Step 1: Get a Cloud Dataproc cluster up and running
In this step, you'll create a Cloud Dataproc cluster named "datascience" with Jupyter notebooks initialized and running using the command line. (Note: Please do not use Cloud Shell as you will not be able to create a socket connection from it in Step 2.)
The simplest approach is to use all default settings for your cluster. Jupyter will run on port 8123 of your master node. If you don't have defaults set, you'll be prompted at this stage to enter a zone for the cluster. As you'll be connecting to the UI on the cluster, choose zones in a region close to you.
gcloud dataproc clusters create datascience \
--initialization-actions \
gs://dataproc-initialization-actions/jupyter/jupyter.sh
Output:
Waiting on operation [projects/------/regions/global/operations/XXX-XXX-XXX-XXX-XXX].
Waiting for cluster creation operation...done.
Created [https://dataproc.googleapis.com/v1/projects/------/regions/global/clusters/datascience].
(If you prefer using a graphical user interface, then the same action can be taken by following these instructions.)
Once completed, your Cloud Dataproc cluster is up and running and ready for a connection.
For the next step, you'll need to know the hostname of your Cloud Dataproc master machine as well as the zone in which your instance was created. To determine that zone, run the following command in your terminal:
gcloud dataproc clusters list
Output:
NAME         WORKER_COUNT  STATUS   ZONE
datascience  2             RUNNING  europe-west1-c
The cluster master-host-name is the name of your Cloud Dataproc cluster followed by an -m suffix. For example, if your cluster is named "my-cluster", the master-host-name would be "my-cluster-m".
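As a small sketch of the naming rule above, the master host name can be derived from the cluster name in the shell. The variable names CLUSTER_NAME and MASTER_HOST are only illustrative and are not part of gcloud.

```shell
# Cluster name used in Step 1; change it if your cluster is named differently.
CLUSTER_NAME="datascience"

# The master host name is the cluster name followed by the -m suffix.
MASTER_HOST="${CLUSTER_NAME}-m"

echo "${MASTER_HOST}"
```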
Step 2: Connect to the Jupyter notebook
You'll use an ssh tunnel from your local machine to the server to connect to the notebook. Depending on your machine’s networking setup, this step can take a little while to get right, so before proceeding confirm that everything is working by accessing the YARN UI. From the browser that you launched when following the instructions in the cluster-web-interfaces cloud documentation, access the following URL.
http://datascience-m:8088/
Once the tunnel is running, open the notebook from the same browser, using the master host name and the Jupyter port. The default port is 8123.
http://datascience-m:8123
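The tunnel described in this step could look roughly like the sketch below. It assumes the cluster name and zone from the listing in Step 1, and a local SOCKS port of 1080, which is an arbitrary choice. The script only prints the command so you can check it before running it yourself; the variable names are illustrative.

```shell
# Assumed values: cluster name and zone from Step 1, local SOCKS port chosen arbitrarily.
CLUSTER_NAME="datascience"
ZONE="europe-west1-c"
SOCKS_PORT=1080
MASTER_HOST="${CLUSTER_NAME}-m"

# Arguments after "--" are passed through to ssh:
#   -D opens a SOCKS proxy on the given local port
#   -N keeps the connection open without running a remote command
TUNNEL_CMD="gcloud compute ssh ${MASTER_HOST} --zone=${ZONE} -- -D ${SOCKS_PORT} -N"

# Print the tunnel command and the two URLs to open through the proxy.
echo "${TUNNEL_CMD}"
echo "http://${MASTER_HOST}:8088/"
echo "http://${MASTER_HOST}:8123"
```

The browser has to be configured to send its traffic through the SOCKS proxy on localhost at the chosen port, otherwise the datascience-m host name will not resolve from your machine.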
For more details, follow the Google post this walkthrough is taken from.
enjoy.