How to install Apache Toree for Spark Kernel in Jupyter in (ana)conda environment?

I am trying to install Jupyter support for Spark in a conda environment of the Anaconda distribution (which I set up following http://conda.pydata.org/docs/test-drive.html). I want to use Apache Toree as the Jupyter kernel for this.

Here is what I did after I installed Anaconda:

conda create --name jupyter python=3
source activate jupyter
conda install jupyter
pip install --pre toree
jupyter toree install

Everything worked fine until I reached the last line. There I get

PermissionError: [Errno 13] Permission denied: '/usr/local/share/jupyter'

Which raises the question: why is it even looking in that directory? After all, it is supposed to stay within the environment. So I execute

jupyter --paths

and get

config:
    /home/user/.jupyter
    ~/anaconda2/envs/jupyter/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/user/.local/share/jupyter
    ~/anaconda2/envs/jupyter/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /run/user/1000/jupyter

I am not quite sure what is going on and how to proceed to get everything running in (and if possible only in) the conda environment "jupyter".

Make42 asked May 13 '16 16:05




2 Answers

By default, Jupyter tries to install the kernel into the system-wide kernel registry, which is why it writes to /usr/local/share/jupyter. You can pass the --user flag to use the per-user kernel directory instead. More details are available in kernelspec.py. The following command installs the Toree kernel into the user kernel directory:

jupyter toree install --user
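
To illustrate where each install target ends up, here is a minimal Python sketch. It assumes the Linux data directories shown in the question's jupyter --paths output and Jupyter's convention of keeping kernel specs in a kernels subdirectory of a data directory. kernels_dir and nearest_existing are hypothetical helpers written for this illustration, not part of Jupyter or Toree.

```python
import os
import posixpath
import sys


def kernels_dir(target, prefix=sys.prefix, home=os.path.expanduser("~")):
    """Return the kernels directory for an install target (assumed Linux layout)."""
    data_dirs = {
        # default target: the system-wide registry that raised PermissionError
        "system": "/usr/local/share/jupyter",
        # --user: per-user registry under the home directory
        "user": posixpath.join(home, ".local", "share", "jupyter"),
        # --sys-prefix: inside the active conda env or virtualenv
        "sys-prefix": posixpath.join(prefix, "share", "jupyter"),
    }
    return posixpath.join(data_dirs[target], "kernels")


def nearest_existing(path):
    """Walk up from path until a directory that already exists is found."""
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


if __name__ == "__main__":
    for target in ("system", "user", "sys-prefix"):
        path = kernels_dir(target)
        # A missing directory can only be created if its nearest parent is writable.
        writable = os.access(nearest_existing(path), os.W_OK)
        print(f"{target:10s} {path} (writable: {writable})")
```

Run inside the activated environment as a normal user, the system target would typically be reported as not writable, which matches the PermissionError in the question.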
Viktor Pishchulin answered Oct 23 '22 04:10



You can use --help to see all available options:

$ jupyter toree install --help
A Jupyter kernel for talking to spark

Options
-------

Arguments that take values are actually convenience aliases to full
Configurables, whose aliases are listed on the help line. For more information
on full configurables, see '--help-all'.

--user
    Install to the per-user kernel registry
--replace
    Replace any existing kernel spec with this name.
--sys-prefix
    Install to Python's sys.prefix. Useful in conda/virtual environments.
--debug
    set log level to logging.DEBUG (maximize logging output)
--kernel_name= (ToreeInstall.kernel_name)
    Default: 'Apache Toree'
    Install the kernel spec with this name. This is also used as the base of the
    display name in jupyter.
--spark_home= (ToreeInstall.spark_home)
    Default: '/usr/local/spark'
    Specify where the spark files can be found.
--toree_opts= (ToreeInstall.toree_opts)
    Default: ''
    Specify command line arguments for Apache Toree.
--spark_opts= (ToreeInstall.spark_opts)
    Default: ''
    Specify command line arguments to proxy for spark config.
--interpreters= (ToreeInstall.interpreters)
    Default: 'Scala'
    A comma separated list of the interpreters to install. The names of the
    interpreters are case sensitive.
--python_exec= (ToreeInstall.python_exec)
    Default: 'python'
    Specify the python executable. Defaults to "python"
--log-level= (Application.log_level)
    Default: 30
    Choices: (0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL')
    Set the log level by value or name.
--config= (JupyterApp.config_file)
    Default: ''
    Full path of a config file.

To see all available configurables, use `--help-all`

Examples
--------

    jupyter toree install
    jupyter toree install --spark_home=/spark/home/dir
    jupyter toree install --spark_opts='--master=local[4]'
    jupyter toree install --kernel_name=toree_special
    jupyter toree install --toree_opts='--nosparkcontext'
    jupyter toree install --interpreters=PySpark,SQL
    jupyter toree install --python=python

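Using jupyter toree install --sys-prefix is the best option for conda and venv environments, because it installs the kernel spec under the environment's own prefix (Python's sys.prefix) rather than into a system-wide or per-user directory.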
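
After installing, you can check that the kernel spec really landed inside the environment. jupyter kernelspec list does this from the command line. Below is a rough, stdlib-only Python equivalent for a single data directory. It assumes the usual layout of share/jupyter/kernels/<name>/kernel.json under the environment prefix; list_kernelspecs is a hypothetical helper, not a Jupyter API.

```python
import json
import os
import sys


def list_kernelspecs(data_dir):
    """Map each kernel spec directory name to its display_name from kernel.json."""
    kernels = os.path.join(data_dir, "kernels")
    found = {}
    if not os.path.isdir(kernels):
        return found
    for name in sorted(os.listdir(kernels)):
        spec_file = os.path.join(kernels, name, "kernel.json")
        if os.path.isfile(spec_file):
            with open(spec_file, encoding="utf-8") as fh:
                # Fall back to the directory name if display_name is missing.
                found[name] = json.load(fh).get("display_name", name)
    return found


if __name__ == "__main__":
    # Assumed data directory of the active environment (see jupyter --paths).
    env_data_dir = os.path.join(sys.prefix, "share", "jupyter")
    specs = list_kernelspecs(env_data_dir)
    if not specs:
        print(f"no kernel specs found under {env_data_dir}")
    for name, display in specs.items():
        print(f"{name}: {display}")
```

If the Toree kernel shows up here when the script is run with the environment's Python, the install stayed inside the environment.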

ostrokach answered Oct 23 '22 03:10
