
How to automatically pull the latest commit from a git submodule on Heroku?

I have a COVID-19 reporting web app hosted on Heroku (http://www.rajcovid19.info). Its data comes from the Johns Hopkins University Git repository, which I have added as a submodule of the main project repository that I push to Heroku. This lets me pull updates to the COVID-19 repository on my computer and then push those changes to Heroku. However, I am not able to pull the latest commits of the COVID-19 submodule directly on the Heroku app. I tried using GitPython, but it produces an "Invalid Git Repository" error whenever I try to pull changes.

My current workaround is a script on my laptop that periodically checks the COVID-19 repository for changes and then pushes them to the Heroku app.

This works but requires me to open my laptop at least once each day.

Is it possible to somehow make Heroku pull the latest commits to the submodule automatically?

EDIT:

According to Heroku's documentation (Heroku Ephemeral Storage), the service has "ephemeral storage".

I think this might complicate things as well?

As for my GitPython code that didn't work, here it is:

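import os
from git import Repo

# Root directory for the local COVID-19 repository
root = os.getcwd()

if os.path.isdir(root + "/COVID-19"):
    # The repository already exists locally, so pull the latest commits
    root += "/COVID-19"
    repo = Repo(root)
    git = repo.git
    git.pull()
else:
    # No local copy yet, so clone it into the current directory
    root += "/COVID-19"
    os.system("git clone https://github.com/CSSEGISandData/COVID-19.git")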

This works on my computer, but on the Heroku app it gives me an "Invalid Git Repo" error. I did some debugging and confirmed that the path to the repository was correct on the Heroku app, but that did not help.

Aniansh asked Apr 25 '20 15:04


1 Answer

https://help.heroku.com/RR520244/why-don-t-git-submodules-work-with-heroku-pipelines-review-apps-or-github-sync

According to the Heroku help article linked above, git submodules do not work with Heroku Pipelines, Review Apps, or GitHub sync.

You should solve this differently.

Possible approaches:

1. Write a script that periodically pulls the data and adds it to your project.

git subtree pull --prefix=data --squash --message="update covid data" https://github.com/CSSEGISandData/COVID-19.git master
git push origin HEAD

git subtrees are compatible with Heroku. For this approach you need a VPS with the script added to cron. Cron is a tool that runs scripts periodically at fixed time intervals.

2. On app startup, download the zip or tar.gz, unpack it, and then serve the data. You will need to create a startup.sh script that does this, with the final command starting your program. Something like:

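# Download the latest snapshot of the data repository as a tarball
curl -L https://api.github.com/repos/CSSEGISandData/COVID-19/tarball > data.tar.gz
# Start from an empty data directory (it may not exist on first run)
rm -rf data && mkdir data
# Unpack into data/, dropping the single top-level folder inside the archive
tar -xzf data.tar.gz -C data --strip-components=1
# Start the app
python main.py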
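
Since the app in the question is written in Python, the same download-and-unpack step could also be done from Python at startup instead of from a shell script. The following is only a minimal sketch under some assumptions: the function names `extract_tarball` and `refresh_data` are made up for illustration, the `data` directory and `data.tar.gz` file name are taken from the script above, and the archive is assumed to wrap everything in one top-level folder.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the startup script above
TARBALL_URL = "https://api.github.com/repos/CSSEGISandData/COVID-19/tarball"


def extract_tarball(archive_path, dest):
    """Unpack a gzipped tarball into dest, dropping its top-level folder."""
    dest = Path(dest)
    # Start from an empty directory so files deleted upstream do not linger
    shutil.rmtree(dest, ignore_errors=True)
    dest.mkdir(parents=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            # Drop the first path component (the wrapping folder)
            parts = Path(member.name).parts[1:]
            # Skip the wrapping folder itself, suspicious paths, and links
            if not parts or ".." in parts:
                continue
            if not (member.isfile() or member.isdir()):
                continue
            member.name = str(Path(*parts))
            archive.extract(member, dest)


def refresh_data(dest="data", url=TARBALL_URL, archive_path="data.tar.gz"):
    """Download the latest snapshot to archive_path and unpack it into dest."""
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)
    extract_tarball(archive_path, dest)
```

Calling `refresh_data()` once before the app starts serving would play the same role as the curl and tar lines. Because the files are re-downloaded on every start, nothing has to survive a restart.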

I recommend the second approach. The first approach is the better choice if you want the data versioned.
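
If you do go with the first approach, the script that cron runs could be a thin Python wrapper around the two git commands. This is a rough sketch, not a tested deployment script: the function names `build_commands` and `update_subtree` are hypothetical, and the `data` prefix, `master` branch, and `origin` remote are simply the values from the commands shown in approach 1.

```python
import subprocess

# Repository URL taken from the git subtree command in approach 1
DATA_REPO = "https://github.com/CSSEGISandData/COVID-19.git"


def build_commands(prefix="data", branch="master", remote="origin"):
    """Return the git commands from approach 1 as argument lists."""
    return [
        ["git", "subtree", "pull", f"--prefix={prefix}", "--squash",
         "--message=update covid data", DATA_REPO, branch],
        ["git", "push", remote, "HEAD"],
    ]


def update_subtree(repo_dir, run=subprocess.run):
    """Run each command inside repo_dir; check=True stops at the first failure."""
    for command in build_commands():
        run(command, cwd=repo_dir, check=True)
```

Calling `update_subtree` with the path to your project checkout from a small script, and scheduling that script with cron, would replace the manual pull and push. The `run` parameter exists only so the commands can be inspected without actually invoking git.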

Tin Nguyen answered Nov 04 '22 15:11