I'm writing Python code on Databricks to process some data and output graphs. I want to be able to save these graphs as a picture file (.png or something, the format doesn't really matter) to DBFS.
Code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'fruits':['apple','banana'], 'count': [1,2]})
plt.close()
df.set_index('fruits',inplace = True)
df.plot.bar()
# plt.show()
Things that I tried:
plt.savefig("/FileStore/my-file.png")
[Errno 2] No such file or directory: '/FileStore/my-file.png'
fig = plt.gcf()
dbutils.fs.put("/dbfs/FileStore/my-file.png", fig)
TypeError: has the wrong type - (,) is expected.
After some research, I think dbutils.fs.put only works if you want to save text files.
Running the above code with plt.show() produces a bar graph. I want to be able to save that bar graph as an image to DBFS. Any help is appreciated, thanks in advance!
An easier way uses only matplotlib.pyplot. The fix is the path: prefix it with /dbfs, where DBFS is mounted on the driver's local filesystem:
Example
import matplotlib.pyplot as plt
plt.scatter(x=[1,2,3], y=[2,4,3])
plt.savefig('/dbfs/FileStore/figure.png')
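Applied to the bar chart from the question, the same idea might look like the sketch below. The helper name save_bar_chart is hypothetical (it is not part of matplotlib, pandas, or Databricks), and the sketch assumes the cluster exposes DBFS under /dbfs on the driver:

```python
import os

import matplotlib.pyplot as plt
import pandas as pd


def save_bar_chart(df, path):
    """Draw a bar chart of df and save it as an image file at path.

    Hypothetical helper for illustration only.
    """
    # Create the target directory if it is missing (no-op when it exists).
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    ax = df.plot.bar()      # pandas returns the matplotlib Axes it drew on
    fig = ax.get_figure()
    fig.savefig(path)       # image format is inferred from the file extension
    plt.close(fig)          # release the figure once it is on disk
    return path


# Same data as in the question.
df = pd.DataFrame({'fruits': ['apple', 'banana'], 'count': [1, 2]})
df = df.set_index('fruits')
```

On a Databricks cluster, calling save_bar_chart(df, '/dbfs/FileStore/my-file.png') should then write the image to DBFS. The path here is the one from the question with the /dbfs prefix added.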
You can do this by saving the figure to an in-memory buffer and then using Python's local file APIs to write to the Databricks File System (DBFS).
Example:
import matplotlib.pyplot as plt
from io import BytesIO
# Create a plt or fig, then:
buf = BytesIO()
plt.savefig(buf, format='png')
path = '/dbfs/databricks/path/to/file.png'
# Make sure to open the file in bytes mode
with open(path, 'wb') as f:
    # You can also use buf.seek(0) and then buf.read()
    f.write(buf.getvalue())
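The two steps above (render to memory, then write the bytes) can be split into small reusable functions. This is only a sketch: figure_to_png_bytes and write_bytes are hypothetical names, and creating the parent directory first is an added assumption for the case where the target folder does not exist yet:

```python
import os
from io import BytesIO

import matplotlib.pyplot as plt


def figure_to_png_bytes(fig):
    """Render a matplotlib figure to PNG and return the raw bytes."""
    buf = BytesIO()
    fig.savefig(buf, format='png')  # format must be explicit for a buffer
    return buf.getvalue()


def write_bytes(path, data):
    """Write data to path in binary mode and return the byte count."""
    # Assumption: the parent directory may be missing, so create it first.
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, 'wb') as f:
        return f.write(data)
```

On Databricks this could be used as write_bytes('/dbfs/FileStore/figure.png', figure_to_png_bytes(plt.gcf())), where the /dbfs/FileStore path is an example location and not a requirement.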