Is it necessary to commit DVC files from our CI pipelines? [closed]

DVC uses Git commits to save experiments and to navigate between them.

Is it possible to avoid making automatic commits in CI/CD (the ones made to save data artifacts after dvc repro runs on the CI/CD side)?

B.P.Puneeth Pai asked Apr 16 '20 07:04



1 Answer

> will you make it part of CI pipeline

DVC often serves as part of an MLOps infrastructure. There is a popular blog post about CI/CD for ML in which DVC is used under the hood, and there is another example that uses GitLab CI/CD.

> scenario where you will integrate dvc commit command with CI pipelines?

If you mean a git commit of DVC files (not dvc commit), then yes: you need to commit the dvc-files into Git during the CI/CD process. That said, auto-committing is not the best practice.

How to avoid Git commit in CI/CD:

  1. After ML model training in CI/CD, save the changed dvc-files in external storage (for example, GitLab artifacts/releases), then download the files to a developer machine and commit them there. Users usually write scripts to automate this.
  2. Wait for the DVC 1.0 release, when run-cache (similar to a build cache) will be implemented. Run-cache makes dvc-files ephemeral, so no additional Git commits will be required. Technically, run-cache is an associative storage that maps repo state --> run results and lives outside the Git repo (in the data remote).
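
Option 1 could be scripted roughly as follows. This is only a sketch, not something DVC prescribes: the file-name patterns (*.dvc, Dvcfile, dvc.lock), the function names, and the archive name are assumptions made for illustration, and you may need to adjust the patterns for your DVC version. The script gathers the DVC files found in the repo into a single archive that the CI job can publish as an artifact; a developer then unpacks it locally and commits by hand.

```python
import tarfile
from pathlib import Path

# Assumption: these name patterns cover the DVC metafiles in your repo.
DVC_FILE_PATTERNS = ("*.dvc", "Dvcfile", "dvc.lock")


def find_dvc_files(repo_root):
    """Return DVC metafiles under repo_root as sorted paths relative to it."""
    root = Path(repo_root)
    found = set()
    for pattern in DVC_FILE_PATTERNS:
        for path in root.rglob(pattern):
            rel = path.relative_to(root)
            # Skip directories, DVC's internal .dvc/ directory, and .git/.
            if path.is_file() and rel.parts[0] not in (".dvc", ".git"):
                found.add(rel)
    return sorted(found)


def bundle_dvc_files(repo_root, archive_path):
    """Pack the DVC metafiles into a .tar.gz that CI can keep as an artifact."""
    root = Path(repo_root)
    files = find_dvc_files(root)
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in files:
            # Store repo-relative names so the archive unpacks in place.
            tar.add(root / rel, arcname=rel.as_posix())
    return [rel.as_posix() for rel in files]
```

In a CI job you might call bundle_dvc_files(".", "dvc-files.tar.gz") after training finishes and register the archive as a job artifact. On the developer machine, unpacking it at the repo root restores the dvc-files, which can then be reviewed and committed manually.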

Disclaimer: I'm one of the creators of DVC.

Dmitry Petrov answered Oct 08 '22 07:10