 

How to get IPython built-in magic commands to work in the Jupyter Notebook PySpark kernel?

I am using the PySpark kernel installed through Apache Toree in Jupyter Notebook, with Anaconda v4.0.0 (Python 2.7.11). After getting a table from Hive, I use matplotlib/pandas to plot a graph in the notebook, following this tutorial:

%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Set some Pandas options
pd.set_option('display.notebook_repr_html', False)
pd.set_option('display.max_columns', 20)
pd.set_option('display.max_rows', 25)

normals = pd.Series(np.random.normal(size=10))
normals.plot()

I got stuck at the very first line: trying to use %matplotlib inline shows

Name: Error parsing magics!
Message: Magics [matplotlib] do not exist!
StackTrace:

Looking at Toree Magic and MagicManager, I realised that %matplotlib is being routed to Toree's MagicManager instead of the IPython built-in magic command.

Is it possible for Apache Toree's PySpark kernel to use the IPython built-in magic commands instead?
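One way to see what the kernel actually recognises (assuming Toree's own %LsMagic built-in is available, which is distinct from IPython's %lsmagic):

# Lists the magics registered with Toree's MagicManager;
# IPython built-ins such as %matplotlib do not show up here.
%LsMagic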

asked Sep 19 '16 by Angletear


People also ask

How do I add Pyspark kernel to Jupyter Notebook?

Create a new kernel and point it at the root env in each project. To do so, create a directory 'pyspark' in /opt/wakari/wakari-compute/share/jupyter/kernels/ and place a kernel.json in it, as sketched below. You may choose any name for the 'display_name'. The configuration points to the python executable in the root environment.
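A minimal sketch of the kernel.json this describes, written out with plain Python. The interpreter path is an assumption; point it at the python executable of your own root environment:

import json
import os

# kernel directory from the answer above -- adjust to your install
kernel_dir = '/opt/wakari/wakari-compute/share/jupyter/kernels/pyspark'

spec = {
    "display_name": "PySpark (root env)",   # any name is fine
    "language": "python",
    "argv": [
        "/opt/wakari/anaconda/bin/python",  # assumed path to the root env's python
        "-m", "ipykernel",                  # launch an IPython kernel
        "-f", "{connection_file}",          # Jupyter substitutes this placeholder
    ],
}

if not os.path.isdir(kernel_dir):
    os.makedirs(kernel_dir)
with open(os.path.join(kernel_dir, 'kernel.json'), 'w') as f:
    json.dump(spec, f, indent=2)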

How do you run a command in a Jupyter Notebook?

You can run the notebook document step by step (one cell at a time) by pressing Shift+Enter. You can run the whole notebook in a single step by clicking the menu Cell -> Run All. To restart the kernel (i.e. the computational engine), click the menu Kernel -> Restart.


1 Answer

I did a workaround hack to get PySpark and magic commands working: instead of installing the Toree PySpark kernel, I run PySpark directly in Jupyter Notebook.

  1. Download and install Anaconda2 4.0.0

  2. Download Spark 1.6.0 pre-built for Hadoop 2.6

  3. Append the following lines to ~/.bashrc and run source ~/.bashrc to update the environment variables

    # added to run spark
    export PATH="{your_spark_dir}spark/sbin:$PATH"
    export PATH="{your_spark_dir}spark/bin:$PATH"

    # added to launch spark application in cluster mode
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre

    # next 2 lines are optional, needed only for a Spark cluster
    export HADOOP_CONF_DIR={your_hadoop_conf}/hadoop-conf
    export YARN_CONF_DIR={your_hadoop_conf}/hadoop-conf

    # added by Anaconda2 4.0.0 installer
    export PATH="{your_anaconda_dir}/Anaconda/bin:$PATH"

    # added to run pyspark in jupyter notebook
    export PYSPARK_DRIVER_PYTHON={your_anaconda_dir}/Anaconda/bin/jupyter
    export PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.open_browser=False --NotebookApp.ip='0.0.0.0' --NotebookApp.port=8888"
    export PYSPARK_PYTHON={your_anaconda_dir}/Anaconda/bin/python

Running the Jupyter Notebook

  1. Run pyspark --master=yarn --deploy-mode=client to start the notebook with PySpark running on the YARN cluster (client deploy mode)

  2. Open a browser and enter IP_ADDRESS_OF_COMPUTER:8888, then try the sanity-check cell below
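A quick sanity check for the first notebook cell: with this setup the notebook server drives a plain IPython kernel (so the built-in magics work), and the pyspark launcher predefines sc and sqlContext. The Hive table and columns below are hypothetical placeholders:

    %matplotlib inline

    import matplotlib.pyplot as plt

    # sc and sqlContext are created by the pyspark launcher
    print(sc.version)

    # hypothetical table/columns -- replace with your own Hive table
    df = sqlContext.sql("SELECT col_a, col_b FROM my_db.my_table LIMIT 1000")
    pdf = df.toPandas()             # pull the result back as a pandas DataFrame
    pdf.plot(x='col_a', y='col_b')  # renders inline thanks to %matplotlib inline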

Disclaimer
This is only a workaround, not an actual fix for the problem. Please let me know if you find a way to get IPython built-in magic commands such as %matplotlib notebook working with the Toree PySpark kernel.

answered Jan 02 '23 by Angletear