
DVC checkout without Git

Tags: git, dvc

I am using DVC for data version control in machine learning projects. Typically, switching between versions of data is done by checking out Git branches, commits, or tags to get the appropriate *.dvc files, which record the data checksums, and then running dvc checkout to update the data. For example:

git checkout ddc5c395b2afb2b2a626c62ef63a2c7d85382aa6 # to rollback to an old version of *.dvc files
dvc checkout mydata.dvc # to roll `mydata` back to the previous version 

I now want to use DVC to switch between data versions without using Git. What I am expecting is something like the following:

dvc checkout mydata.dvc --tag v1.0

Could someone please show me how to use DVC in this way? Thank you for any help.

TaQuangTu asked Feb 10 '26 12:02



1 Answer

To follow up on @omessor's comment, there are Python packages that let you work with a Git repo programmatically (without using the git CLI). DVC itself uses both dulwich and pygit2 via scmrepo.

You could actually do what you are looking for directly through DVC's internal API, like this:

from dvc.repo import Repo

dvc = Repo("path/to/your/repo")
dvc.scm.checkout("tags/v1.0")  # git checkout tags/v1.0
dvc.checkout("mydata.dvc")  # dvc checkout mydata.dvc

This would only require installing DVC via pip or conda, and does not require a CLI git installation.
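
If you need this in more than one place, the two calls above could be wrapped in a small helper. This is only a sketch built on the same internal, undocumented API, so it may break between DVC releases. The function name checkout_data and its repo_factory parameter are hypothetical (not part of DVC); repo_factory is there only so a stand-in object can be passed where DVC is not installed.

```python
def checkout_data(repo_path, rev, target, repo_factory=None):
    """Move the Git workspace to `rev`, then sync `target` with DVC.

    `repo_factory` is a hypothetical hook for passing a stand-in object;
    by default the function uses DVC's internal `dvc.repo.Repo`.
    """
    if repo_factory is None:
        # Imported lazily so this module loads even where DVC is absent.
        from dvc.repo import Repo
        repo_factory = Repo

    repo = repo_factory(repo_path)
    repo.scm.checkout(rev)        # same call as in the snippet above: git checkout <rev>
    return repo.checkout(target)  # same call as in the snippet above: dvc checkout <target>
```

Calling checkout_data("path/to/your/repo", "tags/v1.0", "mydata.dvc") would then play the role of the dvc checkout mydata.dvc --tag v1.0 command from the question. Keep in mind that the Git workspace, including the *.dvc files, stays at that revision afterwards.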

Just note that these APIs aren't publicly documented, so you may need to take a look at the DVC and scmrepo source to see how they work:

https://github.com/iterative/dvc/blob/main/dvc/scm.py

pmrowla answered Feb 13 '26 18:02



