
Ways to Plot Spark Dataframe without Converting it to Pandas

Is there any way to plot information from a Spark dataframe without converting the dataframe to pandas?

I did some online research but can't find a way. I need to save these plots automatically as .pdf files, so the built-in visualization tool in Databricks would not work.

Right now, this is what I'm doing (as an example):

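import matplotlib.pyplot as plt

# df = some Spark dataframe
pdf = df.toPandas()  # collects the whole dataframe to the driver as pandas
pdf.plot()
display(plt.show())  # display() is provided by the Databricks notebook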

I want to produce line graphs, histograms, bar charts and scatter plots without converting my dataframe to a pandas dataframe. Thank you!

KikiNeko asked Jul 29 '19 20:07


1 Answer

The display function is only available in Databricks notebooks; it is not part of Spark itself.
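
One possible way to act on this, offered as a sketch rather than a definitive recipe: let Spark do the aggregation, collect only the small result, and draw it with matplotlib directly, writing the file with savefig instead of relying on display. The helper name save_bar_chart_pdf, the column name "category", the output file name, and the sample rows are all assumptions made for illustration. Note that the aggregated rows are still brought to the driver; this only avoids calling toPandas() on the full dataframe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: render to files, no display() needed
import matplotlib.pyplot as plt


def save_bar_chart_pdf(rows, path, xlabel="category", ylabel="count"):
    """Plot (label, value) pairs as a bar chart and write it to a PDF file.

    Hypothetical helper: `rows` can be any sequence of 2-item tuples,
    which includes the Row objects returned by a Spark collect().
    """
    labels = [str(r[0]) for r in rows]
    values = [r[1] for r in rows]
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.savefig(path, format="pdf")
    plt.close(fig)  # free the figure so repeated calls do not pile up
    return path


# Stand-in for what Spark might return from something like
# df.groupBy("category").count().collect()  (assumed column name)
sample_rows = [("a", 3), ("b", 5), ("c", 2)]
save_bar_chart_pdf(sample_rows, "counts.pdf")
```

The same pattern should carry over to line and scatter plots by swapping ax.bar for ax.plot or ax.scatter, as long as the collected result is small enough to fit on the driver.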

Gravity answered Sep 23 '22 07:09