Push and pull my conda environment using git

I have a git repo with my project. I change my conda environment quite frequently, so I want the repo to track changes in the environment, so that I can push the most recent version and pull it on another computer. Is this possible? I searched and found several solutions (e.g. https://tdhopper.com/blog/my-python-environment-workflow-with-conda/), but none of them tracks changes automatically.

In other words, I want any change I make to my environment, such as adding a new package, to be included in the project's repository, so that when I git pull on another computer, the new package is pulled too and added to the environment there.

asked Aug 27 '18 11:08 by okuoub



2 Answers

I use git hooks to make conda environment updates automatic. More information on git hooks is available in the git documentation.

The idea here is to have two git hooks:

  • One that detects whether your local conda environment has changed and, if so, creates a new commit with the updated env.yml file (I chose a pre-push hook for this one).
  • One that detects a change in the env.yml file after a pull, i.e. the remote env.yml differed from the local one and was merged (I chose a post-merge hook for this one).

As described in the documentation, when a git repository is initialized, a folder .git/hooks is created and filled with example scripts. To use one of them, edit the file, rename it to remove its .sample extension, and make sure it is executable.
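
The installation step can be sketched as a small shell helper. This is only an illustration: the function name install_hook and its two arguments are hypothetical and not part of git. It copies the sample instead of renaming it, so the original sample stays available, and it leaves an already installed hook untouched.

```shell
# Hypothetical helper: enable a hook in a hooks directory.
# $1 = hooks directory (normally .git/hooks), $2 = hook name (e.g. pre-push)
install_hook() {
    hooks_dir=$1
    hook_name=$2
    target="$hooks_dir/$hook_name"
    # Do not overwrite a hook that is already installed.
    if [ ! -e "$target" ]; then
        if [ -f "$target.sample" ]; then
            # Start from the sample shipped by git.
            cp "$target.sample" "$target"
        else
            # No sample exists for this hook: start from an empty file.
            : > "$target"
        fi
    fi
    # Hooks are only run if they are executable.
    chmod u+x "$target"
}

# Usage, from the repository root:
# install_hook .git/hooks pre-push
```

The hook file still has to be edited afterwards to contain the script you want git to run.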

NOTE: I use zsh as my shell, but the scripts should work the same in bash (please comment if not); you would just need to change the shebang line.


pre-push hook

  • Rewrite the pre-push.sample file already present in .git/hooks (replace <ENV_NAME> with the name of your conda environment):
#!/usr/bin/env zsh

# printf handles the \n escape the same way in zsh and bash
printf '\n==================== pre-push hook ====================\n'

# Export conda environment to yaml file
conda env export -n <ENV_NAME> > env.yml

# Check if new environment file is different from original 
git diff --exit-code --quiet env.yml 

# If new environment file is different, commit it
if [[ $? -eq 0 ]]; then
    echo "Conda environment not changed. No additional commit."
else
    echo "Conda environment changed. Committing new env.yml"
    git add env.yml
    git commit -m "Updating conda environment"
    echo 'You need to push again to push additional "Updating conda environment" commit.'
    exit 1
fi
  • Remove the .sample extension and make the file executable if necessary (chmod u+x pre-push)

post-merge hook

  • There was no post-merge.sample hook in my .git/hooks folder, so I created a file named post-merge and used this gist (https://gist.github.com/sindresorhus/7996717) as a template:
#!/usr/bin/env zsh

# printf handles the \n escape the same way in zsh and bash
printf '\n==================== post-merge hook ====================\n'

changed_files="$(git diff-tree -r --name-only --no-commit-id ORIG_HEAD HEAD)"

check_run() {
    echo "$changed_files" | grep --quiet "$1" && eval "$2"
}

# Update the conda environment only if env.yml was changed by the merge
check_run env.yml "echo 'env.yml changed: updating the conda environment' && conda env update --file env.yml"
  • And make it executable (chmod u+x post-merge)
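
One detail of check_run is that grep treats env.yml as a regular expression and matches it anywhere in a line, so a path such as backup/env.yml.old would also trigger the update. A stricter variant could look like the sketch below. The function name file_changed is hypothetical, and it assumes env.yml sits at the repository root, which is how git diff-tree would list it.

```shell
# Hypothetical helper: succeed only if the exact path $1 appears in a
# newline-separated list of changed files read from stdin.
file_changed() {
    # -F: treat the pattern as a literal string
    # -x: the pattern must match the whole line
    # -q: print nothing, only set the exit status
    grep -Fxq -- "$1"
}

# In the post-merge hook this could be used as:
# git diff-tree -r --name-only --no-commit-id ORIG_HEAD HEAD \
#     | file_changed env.yml && conda env update --file env.yml
```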

What happens now?

  • When pushing, if the conda environment changed, a message will show that you have to push again to push the commit with the updated env.yml
  • When pulling, if the pulled env.yml differs from the local env.yml, conda will update the local environment with the newly pulled env.yml.

Limitations

  • If the environment changed locally, the updated env.yml is committed but not automatically pushed to the remote, so you have to push a second time. I followed the advice from the post "git commit in pre-push hook".
  • Currently the conda environment is updated after a pull by a post-merge hook. I don't know how this will behave in other cases, a rebase for example.
  • I'm no git expert, so there may be hooks better suited to these tasks.
  • I noticed a prefix section in env.yml which gives the path to your environment folder on your local machine. After some testing, everything seems to run fine, but I don't know whether this could create conflicts when developing on several machines.
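
If the prefix line turns out to be a problem, one possible workaround is to filter it out before writing env.yml. Because the path usually differs between machines, leaving it in would also make env.yml look changed on every machine that exports it. The sketch below is an assumption on my part, not something tested with these hooks, and the name strip_prefix is hypothetical.

```shell
# Hypothetical filter: drop the machine-specific "prefix:" line from an
# exported environment file read on stdin.
strip_prefix() {
    # -v inverts the match, keeping every line that does not start with "prefix:"
    grep -v '^prefix:'
}

# In the pre-push hook the export line could then become:
# conda env export -n <ENV_NAME> | strip_prefix > env.yml
```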

So... comments, corrections and ideas for improvement are more than welcome!

answered Oct 20 '22 17:10 by khourhin


In conda, you can export an environment to a file and create an environment from such a file, so the file can be included in your git repo. If you pull your repo on a different machine, or delete your environment, you can recreate it with:

conda env create -f=env.yml

When you make changes to your environment, run an export before you add/commit:

conda env export > env.yml
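
Note that conda env create fails when an environment with the same name already exists, so on a machine that already has the environment you would update it instead. A minimal sketch that picks between the two is shown below. The function name sync_env is hypothetical, and it assumes the environment is a named one that shows up in the first column of conda env list.

```shell
# Hypothetical helper: create the environment named $1 from file $2 if it
# does not exist yet, otherwise update it in place.
sync_env() {
    env_name=$1
    env_file=$2
    # The first column of `conda env list` holds the environment names.
    if conda env list | awk '{print $1}' | grep -Fxq -- "$env_name"; then
        conda env update -n "$env_name" --file "$env_file"
    else
        conda env create -n "$env_name" --file "$env_file"
    fi
}

# Usage, after a git pull:
# sync_env <ENV_NAME> env.yml
```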

answered Oct 20 '22 15:10 by MarshHawk