I am using Databricks and Spark 7.4 ML.
The following code successfully logs the params and metrics, and I can see ROCcurve.png in the MLflow GUI (just the item in the tree below the model). But the actual plot is blank. Why?
with mlflow.start_run(run_name="logistic-regression") as run:
    pipeModel = pipe.fit(trainDF)
    mlflow.spark.log_model(pipeModel, "model")
    predTest = pipeModel.transform(testDF)
    predTrain = pipeModel.transform(trainDF)
    evaluator = BinaryClassificationEvaluator(labelCol="arrivedLate")
    trainROC = evaluator.evaluate(predTrain)
    testROC = evaluator.evaluate(predTest)
    print(f"Train ROC: {trainROC}")
    print(f"Test ROC: {testROC}")
    mlflow.log_param("Dataset Name", "Flights " + datasetName)
    mlflow.log_metric(key="Train ROC", value=trainROC)
    mlflow.log_metric(key="Test ROC", value=testROC)
    lrModel = pipeModel.stages[3]
    trainingSummary = lrModel.summary
    roc = trainingSummary.roc.toPandas()
    plt.plot(roc['FPR'], roc['TPR'])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve')
    plt.show()
    plt.savefig("ROCcurve.png")
    mlflow.log_artifact("ROCcurve.png")
    plt.close()
display(predTest.select(stringCols + ["arrivedLate", "prediction"]))
What the notebook shows:
What MLflow shows:
Put plt.show() after plt.savefig(). Once plt.show() has displayed the figure, the plot is gone, so a plt.savefig() call made afterwards writes a blank image.
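The reordering can be sketched as follows. This is a minimal illustration rather than the asker's exact pipeline: the fpr and tpr lists are placeholder data standing in for roc['FPR'] and roc['TPR'], the temporary output path is an assumption, and the Agg backend is selected only so the snippet runs outside a notebook. The mlflow.log_artifact call is left as a comment so the sketch does not need a tracking server.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; a notebook would use its own
import matplotlib.pyplot as plt

# Placeholder data (assumption) standing in for roc['FPR'] and roc['TPR'].
fpr = [0.0, 0.1, 0.4, 1.0]
tpr = [0.0, 0.6, 0.9, 1.0]

fig, ax = plt.subplots()
ax.plot(fpr, tpr)
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate")
ax.set_title("ROC Curve")

# Save first, while the figure still holds the plot.
out_path = os.path.join(tempfile.mkdtemp(), "ROCcurve.png")
fig.savefig(out_path)

# Inside an active run, the artifact would be logged here:
# mlflow.log_artifact(out_path)

plt.show()      # display only after the file has been written
plt.close(fig)  # release the figure
```

Holding on to the fig object and calling fig.savefig, instead of plt.savefig, also makes the save independent of whichever figure pyplot currently considers active.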
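Alternatively, pass the figure object directly to mlflow.log_figure, which logs it without a separate savefig step:
import mlflow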
import matplotlib.pyplot as plt
fig, axs = plt.subplots(2)
x0, y0 = [1,2,3], [1,2,3]
x1, y1 = [1,2,3], [1,2,3]
axs[0].plot(x0, y0)
axs[1].plot(x1, y1)
mlflow.log_figure(fig, 'my_plot.png')