 

Artifact storage and MLflow on a remote server

Tags:

python

mlflow

I am trying to get MLflow running against another machine on the local network, and I would like to ask for help because I am not sure what to do next.

I have an MLflow server running on a remote machine. It runs under my user account and was started like this:

mlflow server --host 0.0.0.0 --port 9999 --default-artifact-root sftp://<MYUSERNAME>@<SERVER>:<PATH/TO/DIRECTORY/WHICH/EXISTS>

My program, which should log all the data to the MLflow server, looks like this:

from mlflow import log_metric, log_param, log_artifact, set_tracking_uri

if __name__ == "__main__":
    remote_server_uri = 'http://<SERVER>:9999'  # placeholder; the URI needs the scheme and the server's port
    set_tracking_uri(remote_server_uri)
    # Log a parameter (key-value pair)
    log_param("param1", 5)

    # Log a metric; metrics can be updated throughout the run
    log_metric("foo", 1)
    log_metric("foo", 2)
    log_metric("foo", 3)

    # Log an artifact (output file)
    with open("output.txt", "w") as f:
        f.write("Hello world!")
    log_artifact("output.txt")

The parameters and metrics get transferred to the server, but not the artifacts. Why is that?

Note on the SFTP part: I can log in via SFTP, and the pysftp package is installed.

asked Nov 22 '19 by Spark Monkay

People also ask

Where is MLflow data stored?

Where Runs Are Recorded. MLflow runs can be recorded to local files, to a SQLAlchemy compatible database, or remotely to a tracking server. By default, the MLflow Python API logs runs locally to files in an mlruns directory wherever you ran your program.

How do you save artifacts in MLflow?

Replace <local-path-to-store-artifacts> with the local path where you want to store the artifacts. Replace <run-id> with the run_id of your specified MLflow run. After the artifacts have been downloaded to local storage, you can copy (or move) them to an external filesystem or a mount point using standard tools.
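The download step described above can be sketched with the MLflow CLI; `<run-id>` and `<local-path-to-store-artifacts>` are the same placeholders used in the text:

```shell
# Download all artifacts of a run to local storage
mlflow artifacts download --run-id <run-id> --dst-path <local-path-to-store-artifacts>

# Then copy (or move) them to an external filesystem with standard tools
cp -r <local-path-to-store-artifacts> /mnt/external/
```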

Can MLflow be used in production?

MLflow is an open-source platform for machine learning lifecycle management. Recently, I set up MLflow in production with a Postgres database as a Tracking Server and SFTP for the transfer of artifacts over the network.
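A setup like that could be started with a server launch command along these lines; every credential, host, and path below is a placeholder assumption:

```shell
# Postgres as the tracking backend, SFTP as the artifact store
mlflow server \
    --host 0.0.0.0 --port 9999 \
    --backend-store-uri postgresql://<DBUSER>:<DBPASSWORD>@<DBHOST>/<DBNAME> \
    --default-artifact-root sftp://<MYUSERNAME>@<SERVER>/<PATH/TO/ARTIFACTS>
```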


2 Answers

I guess the problem is that you also need to create the experiment with the SFTP remote storage as its artifact location:

mlflow.create_experiment("my_experiment", artifact_location=sftp_uri)

This fixed it for me.
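Put together with the question's setup, the whole flow might look like the sketch below. The server, user, and path are the question's placeholders, and the experiment name is an assumption; the existence check is only there so the script can be re-run:

```python
import mlflow

# Placeholders from the question; replace with your own values
tracking_uri = "http://<SERVER>:9999"
sftp_uri = "sftp://<MYUSERNAME>@<SERVER>/<PATH/TO/DIRECTORY/WHICH/EXISTS>"

mlflow.set_tracking_uri(tracking_uri)

# Create the experiment once with the SFTP artifact location,
# then select it before logging anything
experiment_name = "my_experiment"  # assumed name
if mlflow.get_experiment_by_name(experiment_name) is None:
    mlflow.create_experiment(experiment_name, artifact_location=sftp_uri)
mlflow.set_experiment(experiment_name)

with mlflow.start_run():
    mlflow.log_param("param1", 5)
    with open("output.txt", "w") as f:
        f.write("Hello world!")
    mlflow.log_artifact("output.txt")  # now uploaded over SFTP
```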

answered Sep 20 '22 by sklingel

I don't know if I will get a better answer to my problem, but I solved it this way.

On the server I created the directory /var/mlruns. I pass this directory to MLflow via --backend-store-uri file:///var/mlruns

Then I mount this directory on my local machine under the same path, e.g. via sshfs.

I don't like this solution, but it solves the problem well enough for now.
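The workaround above might be sketched as the following commands; user, host, and the mlflow options are taken from this thread, but treat them as assumptions for your own setup:

```shell
# On the server: a plain file store under /var/mlruns
mkdir -p /var/mlruns
mlflow server --host 0.0.0.0 --port 9999 \
    --backend-store-uri file:///var/mlruns \
    --default-artifact-root /var/mlruns

# On each client: mount the same path so the file-based
# artifact URIs recorded by the server resolve locally too
mkdir -p /var/mlruns
sshfs <MYUSERNAME>@<SERVER>:/var/mlruns /var/mlruns
```

The key point is that the mount path matches on both machines, so artifact paths written by the server are valid on the client.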

answered Sep 22 '22 by Spark Monkay