
How do I install Jupyter/iPython on Dataproc?

I want to use Jupyter/iPython on Cloud Dataproc. How can I automatically install and configure it when I create new clusters?

James asked Dec 28 '25 18:12



1 Answer

The Cloud Dataproc team maintains a GitHub repository of sample and commonly used initialization actions. It includes one for iPython that you can use to install and configure iPython automatically. The initialization action page has more details on how to use the scripts when creating a new cluster.

The tl;dr process:

  1. Download the initialization action for iPython
  2. Save the initialization action into a Google Cloud Storage bucket
  3. Create a new cluster with the Google Cloud SDK using the --initialization-actions flag:

    gcloud beta dataproc clusters create <my-dataproc-cluster> --initialization-actions gs://<my-bucket>/ipython.sh

  4. Create an SSH tunnel and SOCKS proxy to the cluster

  5. Open a web browser to the master node http://<my-dataproc-cluster>-m:8123

In the example above, replace <my-bucket> with the name of your Cloud Storage bucket and <my-dataproc-cluster> with the name of your cluster. Note that in step 5 the URL appends -m to the cluster name so that it points to the master node.
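
Steps 2 through 5 could be scripted roughly as follows. This is only a sketch under several assumptions: the zone, the local SOCKS port, and the helper functions master_host, notebook_url, and run are hypothetical names introduced for illustration, and ipython.sh is assumed to have been downloaded already into the current directory. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions); override them from the environment.
CLUSTER="${CLUSTER:-my-dataproc-cluster}"
BUCKET="${BUCKET:-my-bucket}"
ZONE="${ZONE:-us-central1-a}"      # hypothetical zone
SOCKS_PORT="${SOCKS_PORT:-1080}"   # hypothetical local port for the SOCKS proxy

# The master node's hostname is the cluster name with "-m" appended.
master_host() { printf '%s-m\n' "$1"; }

# URL of the notebook server on the master node (port 8123, as in step 5).
notebook_url() { printf 'http://%s:8123\n' "$(master_host "$1")"; }

# Print a command, and execute it only when RUN=1 (dry run by default).
run() {
  printf '+ %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Step 2: copy the initialization action into the bucket.
run gsutil cp ipython.sh "gs://${BUCKET}/ipython.sh"

# Step 3: create the cluster with the initialization action attached.
run gcloud beta dataproc clusters create "${CLUSTER}" \
  --initialization-actions "gs://${BUCKET}/ipython.sh"

# Step 4: open an SSH tunnel with a SOCKS proxy on localhost.
# With RUN=1 this command blocks for as long as the tunnel is open.
run gcloud compute ssh "$(master_host "${CLUSTER}")" --zone "${ZONE}" \
  -- -D "${SOCKS_PORT}" -N

# Step 5: the address to open in a browser that uses the SOCKS proxy.
echo "Browse to $(notebook_url "${CLUSTER}") via the SOCKS proxy on localhost:${SOCKS_PORT}"
```

The browser must be configured to send its traffic through the SOCKS proxy (localhost and the chosen port); otherwise the <my-dataproc-cluster>-m hostname will not resolve from your local machine.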

James answered Dec 30 '25 22:12



