Display PySpark DataFrame as HTML Table in Jupyter Notebook

Question

I'm trying to display a PySpark dataframe as an HTML table in a Jupyter Notebook, but all methods seem to be failing.

Using this method displays a text-formatted table:

import pandas
df.toPandas()

Using this method displays the HTML table as a string:

df.toPandas().to_html()

This prints the raw HTML in a more readable form, but it still isn't rendered as a table:

print(df.toPandas().to_html())

And all of these

from IPython.display import display, HTML

HTML(df.toPandas().to_html())
print(HTML(df.toPandas().to_html()))
display(HTML(df.toPandas().to_html()))

simply print this object description:

<IPython.core.display.HTML object>

Any other ideas I can try?

mkirzon · Accepted Answer

I ran into this issue using PySpark kernels within JupyterLab notebooks on AWS EMR clusters. I found that the sparkmagic command %%display solved the issue. For instance, my Jupyter cell would look like this:

%%display
some_spark_df

It's also worth pointing out that this errored if there were empty lines between %%display and the variable.

However, I'm not sure how to do the same with a pandas DataFrame. That still returns the object description when using the PySpark kernel (as opposed to a pure Python 3 kernel).

Travis Pfrommer · Answer

df.toPandas() really does render the DataFrame as an HTML object, but my assumption is that you are looking for something else, or are trying to get rid of the ellipses (...).

You can configure pandas beforehand to get rid of those. This is what I use to remove truncation at the column, row, and field levels:

import pandas as pd

# None removes the column-width limit (-1 is deprecated since pandas 1.0)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)

You can also use the approach from your question, but your calls are a little out of order. Here is a quick helper function that I use:

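from IPython.display import HTML

def printDf(sprkDF, records):
    # Limit in Spark first so only `records` rows are collected to the driver
    return HTML(sprkDF.limit(records).toPandas().to_html())

# Call it as the last expression in a cell so the result is rendered:
# printDf(df, 10)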

Hope this helps.
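
As for why print(HTML(...)) only shows an object description: print() uses the object's plain-text representation, whereas a notebook front end that supports rich output asks the object for HTML through its _repr_html_ method. Below is a minimal pure-Python sketch of that mechanism. HtmlTable is a made-up stand-in for illustration, not IPython's actual HTML class.

```python
from html import escape


class HtmlTable:
    """Hypothetical stand-in for a rich object such as IPython's HTML wrapper."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def _repr_html_(self):
        # Front ends with rich-output support call this hook to get markup.
        head = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"

    def __repr__(self):
        # print() and plain-text front ends only ever see this string.
        return "<HtmlTable object>"


table = HtmlTable(["id", "name"], [[1, "a<b"], [2, "c"]])
text_view = str(table)            # what print(table) would show
html_view = table._repr_html_()   # what a rich front end would render
```

If the kernel or front end only relays plain text, the __repr__ string is all you would see, which would be consistent with the <IPython.core.display.HTML object> output in the question.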

Tags: python, pandas, jupyter-notebook, pyspark

nxl4
