
How to clone a list of GIT repositories?

Tags: git, git-clone

I have a list of 70+ Git repo URLs (students' repos). Is there a feature that lets me clone them all at once?

Is there something similar for synchronizing the repositories with the server afterwards?

If not, I guess I'd need to write a quick shell script in order to do this.

Kevin Van Ryckegem asked Nov 11 '15 11:11


2 Answers

Shell scripting.

Getting the repos

The principal idea to get the repos is

while IFS= read -r repo; do
    git clone "$repo"
done < repolist.txt

assuming the file "repolist.txt" contains one repo URL per line.
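
For a list of 70+ URLs it may help to make that loop tolerant of blank lines, comment lines, and re-runs. The following is only a sketch under assumptions: the function name clone_all, the '#' comment convention, and the target directory naming are not part of the answer above.

```shell
#!/bin/sh
# clone_all LIST [DEST]: clone every URL listed in LIST into DEST (default: .).
# Hypothetical helper; name and list conventions are assumptions.
clone_all() {
    list=$1
    dest=${2:-.}
    failed=0
    mkdir -p "$dest" || return 1
    while IFS= read -r repo || [ -n "$repo" ]; do
        # Skip blank lines and lines starting with '#'
        case $repo in
            ''|'#'*) continue ;;
        esac
        # Target directory: last path component of the URL, minus ".git"
        name=$(basename "$repo" .git)
        if [ -d "$dest/$name" ]; then
            echo "skip: $name already exists" >&2
            continue
        fi
        # stdin is redirected so nothing git spawns can read from the list
        git clone --quiet "$repo" "$dest/$name" < /dev/null || {
            echo "failed: $repo" >&2
            failed=$((failed + 1))
        }
    done < "$list"
    # Succeed only if every clone succeeded
    [ "$failed" -eq 0 ]
}
```

Usage would be something like clone_all repolist.txt students. Note that if several student repos share the same last path component, only the first one is cloned and the rest are reported as skipped; in that case the target name would have to be derived differently.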

Updating the repos

This one is trickier.

While it's easy to iterate over the list of repos, there's a conceptual problem with "synchronizing". Its essence is this: when you clone the "normal" way (that is, without command-line options that change git clone's defaults), every branch of the source repo is created in your local repo as a so-called "remote branch" (a remote-tracking branch, in Git's own terminology). Those remote branches merely record the state of the matching branches in the source repo. A single branch, the one designated as "current" in the source repo, is then taken, and a local (that is, yours only) branch is created from it. That's why cloning a repo with 100 branches leaves you with only a single local branch (typically "master").

What follows is that automatic "synchronization" is largely a moot point here: when you run git fetch origin in a "normally" cloned repo, the remote branches are updated with their new contents and are hence almost [1] fully synchronized. Note that your local branches are not touched at all. That's because you might have done local work on them, so you have to decide how you want to reconcile the updated state of the remote branches with your local branches, if at all. This is simply the default work model Git assumes, because that's what is needed in most cases.
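
To see this behaviour in a normally cloned repo, one can fetch and then count how many commits the remote branch has that the local branch lacks. This is a small sketch; behind_count is a hypothetical helper name, and it assumes the remote is called origin.

```shell
# behind_count DIR BRANCH: fetch, then print how many commits origin/BRANCH
# has that the local BRANCH does not. Hypothetical helper, remote assumed
# to be named "origin". The local branch itself is never modified.
behind_count() {
    git -C "$1" fetch --quiet origin &&
    git -C "$1" rev-list --count "$2..origin/$2"
}
```

A non-zero count means the fetch brought in new commits while the local branch stayed where it was.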

If, instead, you don't intend to do any work on the branches of those repos, and they are for inspection only, the easiest approach is to make Git have no remote branches at all.

To do this, you clone using several explicit steps:

  1. Initialize an empty repository:

    git init <dirname>
    
  2. Configure a remote there:

    git remote add --mirror=fetch origin <url>
    

    The --mirror=fetch option tells Git to set up the fetch mapping (the refspec) so that every fetched ref forcefully overwrites the local ref of the same name.

  3. Fetch all the data — overwriting everything local:

    git fetch -u origin
    

    The -u (or --update-head-ok) option permits Git to overwrite the branch pointed to by the HEAD reference. This pulls the rug out from under the index and the work tree, but we'll compensate for that in the next step.

  4. Force-update the index and the work tree using the new data:

    git reset --hard HEAD
    

    This makes Git overwrite the index and the work tree with the up-to-date state of the branch pointed at by HEAD — typically "master" but should you check another branch out (see below) it will obviously use that one.
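
The four steps above can be wrapped into one function so they can be run per URL. This is a sketch only: mirror_checkout is a hypothetical name, and it assumes the branch that git init creates locally has the same name as the branch you want from the remote.

```shell
# mirror_checkout URL DIR: the four steps above in one hypothetical function.
# Assumes the default branch name created by "git init" also exists remotely.
mirror_checkout() {
    url=$1
    dir=$2
    git init --quiet "$dir" &&                                # step 1
    git -C "$dir" remote add --mirror=fetch origin "$url" &&  # step 2
    git -C "$dir" fetch --quiet -u origin &&                  # step 3
    git -C "$dir" reset --quiet --hard HEAD                   # step 4
}
```

If the remote's default branch is named differently from the local one, HEAD will not resolve after the fetch and the last step fails; checking out one of the fetched branches by name should then work instead.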

Then, to update the data next time you do:

git fetch -u origin
git reset --hard HEAD

and then study what's in the work tree.

If you need to view another branch, list the available ones with the usual

git branch -a

pick a branch from the list, then

git checkout <that_branch>

will switch to it.

In essence, this whole dance with explicit repo initialization and adding a remote in a special way is needed because the --mirror option of git clone implies creating a bare repository, and we presumably want a normal one with a work tree.

To update all the repos located in a directory, do

find "$root_dir" -mindepth 1 -maxdepth 1 -type d -print \
    | while IFS= read -r repo; do
        # Subshell: the cd must not leak into the next iteration
        (
            cd "$repo" &&
            git fetch -u origin &&
            git reset --hard HEAD
        )
      done

[1] Branches deleted in the remote repo are not deleted locally. To delete them, run git remote prune origin.
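
For a normally cloned repo, fetching and pruning can also be combined in one command via the --prune option of git fetch. A minimal sketch, with fetch_pruned as a hypothetical helper name and origin assumed as the remote name:

```shell
# fetch_pruned DIR: fetch from origin and drop remote-tracking branches whose
# counterpart no longer exists on the remote. Hypothetical helper; intended
# for a normally cloned repo.
fetch_pruned() {
    git -C "$1" fetch --quiet --prune origin
}
```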

kostix answered Sep 21 '22 09:09


You could create a "super project" that includes all mentioned Git repos as submodules (also see the Git SCM book), or use a tool like repo that uses a manifest to manage all repositories.
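
As a rough sketch of the submodule variant: inside an already initialized "super project" repository, each URL from the list can be added with git submodule add. The helper name add_submodules, the list format (one URL per line, '#' for comments), and the path naming are assumptions, not something the answer specifies.

```shell
# add_submodules LIST: run from the top of an existing super project.
# Adds every URL in LIST as a submodule named after the last path component.
# Hypothetical helper; list format and naming are assumptions.
add_submodules() {
    while IFS= read -r repo || [ -n "$repo" ]; do
        # Skip blank lines and lines starting with '#'
        case $repo in
            ''|'#'*) continue ;;
        esac
        git submodule add "$repo" "$(basename "$repo" .git)" < /dev/null ||
            echo "failed: $repo" >&2
    done < "$1"
}
```

After committing the result, git submodule update --remote can later bring the submodules up to date with their remotes.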

sschuberth answered Sep 22 '22 09:09