
Nested runs using MLflowClient

Tags: python, mlflow

In MLflow, you can create nested runs with the fluent API; they are shown as collapsible in the UI. For example, with the following code (see this for UI support):

import mlflow

# Each with block ends its run on exit, so no explicit mlflow.end_run() is needed.
with mlflow.start_run(nested=True):
  mlflow.log_param("mse", 0.10)
  mlflow.log_param("lr", 0.05)
  mlflow.log_param("batch_size", 512)
  # Started while the outer run is active, so it becomes its child.
  with mlflow.start_run(nested=True):
    mlflow.log_param("max_runs", 32)
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("acc", 98)
    mlflow.log_metric("rmse", 98)

Due to database connection issues, I want to use a single mlflow client across my application.

How can I nest runs, e.g. for hyperparameter optimization, when the runs are created via MlflowClient().create_run()?

Julian L. asked Nov 07 '22 16:11


1 Answer

It is a bit complicated to achieve, but I found a way by looking into the fluent tracking interface, which is what you use when you call functions on the mlflow module directly.

In the start_run function you can see that a nested run is defined simply by setting a specific tag, mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID, on the child run. Set this tag to the run.info.run_id value of your parent run and the child will be shown nested under it in the UI.

Here is an example:

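from mlflow.tracking import MlflowClient
from mlflow.utils.mlflow_tags import MLFLOW_PARENT_RUN_ID

client = MlflowClient()

# Reuse the experiment if it already exists, otherwise create it.
# get_experiment_by_name() returns None when no experiment has that name.
experiment = client.get_experiment_by_name("test_nested")
if experiment is None:
    experiment_id = client.create_experiment("test_nested")
else:
    experiment_id = experiment.experiment_id

parent_run = client.create_run(experiment_id=experiment_id)
client.log_param(parent_run.info.run_id, "who", "parent")

# A child run is an ordinary run that carries the parent's run ID as a tag.
child_run_1 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_1.info.run_id, "who", "child 1")

child_run_2 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_2.info.run_id, "who", "child 2")

# Runs created through the client stay in the RUNNING state until terminated.
for run in (child_run_1, child_run_2, parent_run):
    client.set_terminated(run.info.run_id)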
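
For the hyperparameter optimization case from the question, the same tag can be applied in a loop that shares one client. The following is only a minimal sketch, not something MLflow provides: log_sweep and the trials format are hypothetical, client is assumed to be an MlflowClient (any object with the same create_run, log_param, log_metric and set_terminated methods would do), and the tag key is written out as the literal string that MLFLOW_PARENT_RUN_ID holds.

```python
# Value of mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID, written as a literal
# so that this sketch does not depend on the import.
PARENT_RUN_ID_TAG = "mlflow.parentRunId"


def log_sweep(client, experiment_id, trials):
    """Log one parent run with one child run per trial (hypothetical helper).

    `trials` is a list of (params, metrics) pairs of dicts holding results
    that were computed elsewhere. Returns (parent_run_id, child_run_ids).
    """
    parent_id = client.create_run(experiment_id=experiment_id).info.run_id
    child_ids = []
    for params, metrics in trials:
        # The parent tag is what makes the UI show this run as nested.
        child = client.create_run(
            experiment_id=experiment_id,
            tags={PARENT_RUN_ID_TAG: parent_id},
        )
        child_id = child.info.run_id
        for key, value in params.items():
            client.log_param(child_id, key, value)
        for key, value in metrics.items():
            client.log_metric(child_id, key, value)
        # Client-created runs are not closed automatically.
        client.set_terminated(child_id)
        child_ids.append(child_id)
    client.set_terminated(parent_id)
    return parent_id, child_ids
```

With a real tracking backend this could be called as log_sweep(MlflowClient(), experiment_id, [({"lr": 0.05}, {"acc": 0.98})]), so that every trial goes through the same client instance.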

In case you're wondering: The run name can also be specified that way, using the mlflow.utils.mlflow_tags.MLFLOW_RUN_NAME tag.
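
To illustrate the run-name tag, the tags for a named child run can be assembled in one place. This is a sketch under assumptions: nested_run_tags is a hypothetical helper, and the two keys are the literal string values of MLFLOW_PARENT_RUN_ID and MLFLOW_RUN_NAME rather than the imported constants.

```python
def nested_run_tags(parent_run_id, run_name=None):
    """Build the tags dict for a child run (hypothetical helper)."""
    # "mlflow.parentRunId" is the value of MLFLOW_PARENT_RUN_ID.
    tags = {"mlflow.parentRunId": parent_run_id}
    if run_name is not None:
        # "mlflow.runName" is the value of MLFLOW_RUN_NAME.
        tags["mlflow.runName"] = run_name
    return tags
```

The result would then be passed to the client, e.g. client.create_run(experiment_id=experiment_id, tags=nested_run_tags(parent_run.info.run_id, "child 1")).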

Simon Hessner answered Nov 15 '22 12:11