How do I make matplotlib work in AWS EMR Jupyter notebook?

This is very close to the question "Matplotlib Plotting using AWS-EMR jupyter notebook", but I have added a few details specific to my case.

I would like to find a way to use matplotlib inside my Jupyter notebook. Here is the failing code snippet; it's fairly simple:

notebook

import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()

I chose this snippet because the following line fails on its own, since it tries to use Tkinter (which is not installed on an AWS EMR cluster):

import matplotlib.pyplot as plt 

When I run the full notebook snippet, there is no runtime error, but nothing happens either (no graph is shown). My understanding is that one way to make this work is by adding either of the following snippets:

pyspark magic notation

%matplotlib inline 

results

unknown magic command 'matplotlib'
UnknownMagic: unknown magic command 'matplotlib'

IPython explicit magic call

from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')

results

'NoneType' object has no attribute 'run_line_magic'
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'run_line_magic'

Either snippet, added to my notebook, should invoke a Spark magic command that inlines matplotlib plots (at least that's my interpretation). I have tried both of these after using a bootstrap action:

EMR bootstrap

sudo pip install matplotlib
sudo pip install ipython

Even with these added, I still get an error saying that there is no magic for matplotlib. So my question is:

Question

How do I make matplotlib work in an AWS EMR Jupyter notebook?

(Or how do I view graphs and plot images in AWS EMR Jupyter notebook?)
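
The AttributeError shown earlier suggests that get_ipython() returned None, which is what IPython does when the code is not running inside an IPython session. A minimal sketch of a guard around the explicit magic call, assuming that is what happens when the cell is executed on the cluster rather than in the notebook's own kernel (the helper names get_shell_or_none and enable_inline_plots are made up for illustration):

```python
def get_shell_or_none():
    """Return the active IPython shell, or None when there is none."""
    try:
        from IPython import get_ipython  # IPython may not be installed at all
    except ImportError:
        return None
    return get_ipython()  # returns None outside an IPython session


def enable_inline_plots():
    """Try to enable inline plotting; return True only if the magic was run."""
    shell = get_shell_or_none()
    if shell is None:
        # No IPython shell here, so there is nothing to run the magic on.
        return False
    shell.run_line_magic("matplotlib", "inline")
    return True


if __name__ == "__main__":
    print("inline plotting enabled:", enable_inline_plots())
```

If this prints False inside the notebook, the cell is most likely not being executed by an IPython kernel, which would also be consistent with the "unknown magic command" message.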

asked May 22 '19 21:05 by Matt


2 Answers

As you mentioned, matplotlib is not installed on the EMR cluster, so an error like this occurs:

error

However, it is actually available in the managed Jupyter notebook instance (the Docker container). Using the %%local magic allows you to run the cell locally:

local
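
A minimal sketch of what such a cell might look like, assuming the notebook uses the sparkmagic PySpark kernel (where %%local runs the cell in the notebook's own Python environment instead of on the cluster) and that matplotlib is importable there, as described above. The plotted data is simply the list from the question:

```python
%%local
# Runs in the notebook's own (local) Python environment, not on the EMR cluster.
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4])  # same data as in the question
plt.show()
```

Note that variables defined on the cluster are not visible in a %%local cell, so this only plots data that exists locally. If no figure appears, adding %matplotlib inline directly below the %%local line may be needed; that is an assumption about the local kernel's default backend, not something confirmed here.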

answered Sep 16 '22 12:09 by Foxan Ng


The answer by @00schneider actually works.

import matplotlib.pyplot as plt
# plot data here
plt.show()

After plt.show(), re-run the magic cell that contains the line below, and you will see the plot in your AWS EMR Jupyter PySpark notebook:

%matplot plt 
answered Sep 19 '22 12:09 by Madaditya