I want to clone the Linux kernel repo, but only from version 3.0 onwards. The kernel repo is so huge that my versioning tools run faster if I can do a shallow clone. The core of my question is: how can I tell git what the "n" value is for the --depth parameter? I was hoping this would work:
git clone http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git --depth v3.0
thanks.
If you only need the specific tag, you can pass the --single-branch flag, which prevents fetching all the branches in the cloned repository. With the --single-branch flag, only the branch/tag specified by the --branch option is cloned.
$ git clone -b <tagname> --single-branch <repository> .
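To check what a --single-branch clone of a tag actually brings over, one option is to try it on a throwaway repository. This is only a sketch: the repository layout, the tag name v1, the directory names and the tmp identity are all made up for the demo, and it clones through file:// so that objects are transferred the same way as from a remote.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=tmp GIT_AUTHOR_EMAIL=tmp@tmp
export GIT_COMMITTER_NAME=tmp GIT_COMMITTER_EMAIL=tmp@tmp

work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" commit -q --allow-empty -m "first"
git -C "$work/origin" tag v1                      # hypothetical tag to clone
git -C "$work/origin" commit -q --allow-empty -m "second"
tip=$(git -C "$work/origin" rev-parse HEAD)       # branch tip, newer than v1

# Clone only the tag; HEAD ends up detached at v1.
git clone -q --branch v1 --single-branch "file://$work/origin" "$work/only_tag"

commit_count=$(git -C "$work/only_tag" rev-list --count --all)
if git -C "$work/only_tag" cat-file -e "$tip" 2>/dev/null; then
    fetched_tip=yes
else
    fetched_tip=no
fi
echo "commits in clone: $commit_count, branch tip fetched: $fetched_tip"
```

The commit made after the tag is not transferred, which is the saving the flag is meant to give.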
Read fully for a solution, but unfortunately, git clone does not work in the fashion you are requesting. The --depth parameter takes a number of commits, counted back from the tip of each fetched branch, not a tag or revision name, so --depth v3.0 is rejected. --depth has no form that cuts history at a given tag. In your situation, even if you worked out a number large enough to reach v3.0 from the newest HEAD in the repo, you could still get more history than you want: the kernel history is full of merges, depth is counted along every parent chain, and some of those chains reach commits older than v3.0 within that many steps.
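To see what a given --depth value actually keeps, one option is to try it on a small throwaway repository. The sketch below is only an illustration: the five-commit repository, the directory names and the tmp identity are made up for the demo, and it clones through file:// because --depth is ignored for plain local paths.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=tmp GIT_AUTHOR_EMAIL=tmp@tmp
export GIT_COMMITTER_NAME=tmp GIT_COMMITTER_EMAIL=tmp@tmp

work=$(mktemp -d)
git init -q "$work/origin"
for i in 1 2 3 4 5; do
    git -C "$work/origin" commit -q --allow-empty -m "commit $i"
done

# Shallow clone that keeps two commits counted back from the tip.
git clone -q --depth 2 "file://$work/origin" "$work/shallow"

full_count=$(git -C "$work/origin" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "origin: $full_count commits, shallow clone: $shallow_count commits"
```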
Now here is how to do what you want. The key to your issue is that you need the commits between v3.0 and the most recent reference you want. Here are the steps I took to do just that:
git clone http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git --depth 10075 smaller_kernel_repo
cd smaller_kernel_repo
git log --oneline v3.0^..v3.0
echo "02f8c6aee8df3cdc935e9bdd4f2d020306035dbe" > .git/info/grafts
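A single hash on a line in .git/info/grafts tells git to treat that commit as having no parents, which is what makes the history appear to start at v3.0. Recent Git releases deprecate the grafts file in favour of git replace --graft, so the sketch below uses that swapped-in command rather than the grafts file itself. The five-commit repository, the tag name v3 and the tmp identity are made up for the demo.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=tmp GIT_AUTHOR_EMAIL=tmp@tmp
export GIT_COMMITTER_NAME=tmp GIT_COMMITTER_EMAIL=tmp@tmp

work=$(mktemp -d)
repo="$work/repo"
git init -q "$repo"
for i in 1 2 3 4 5; do
    git -C "$repo" commit -q --allow-empty -m "commit $i"
    if [ "$i" -eq 3 ]; then
        git -C "$repo" tag v3        # hypothetical stand-in for v3.0
    fi
done

before=$(git -C "$repo" rev-list --count HEAD)

# Listing no parents after the commit makes it look like a root commit.
git -C "$repo" replace --graft "$(git -C "$repo" rev-parse 'v3^{commit}')"

after=$(git -C "$repo" rev-list --count HEAD)
echo "commits visible before graft: $before, after graft: $after"
```

As with the grafts file, this only changes what git shows; the older objects stay on disk until the history is rewritten and the repository is cleaned up.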
To get around some issues with some kernel log entries, run export GIT_AUTHOR_NAME="tmp" and export GIT_COMMITTER_NAME="tmp".
The man page for git filter-branch has a nice warning about it rewriting history by following graft points... so let's abuse that: now run git filter-branch and sit back and wait... (and wait and wait)
Now you need to clean up everything:
git reflog expire --expire=now --all
git repack -ad  # Remove dangling objects from packfiles
git prune       # Remove dangling loose objects
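To check that the cleanup commands really drop unreachable objects, they can be tried on a throwaway repository first. In this sketch the repository, the tmp identity and the hand-made unreachable commit are all made up for the demo. It also assumes nothing else still points at the old history: after git filter-branch, backup refs under refs/original/ keep the old objects reachable until they are deleted.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=tmp GIT_AUTHOR_EMAIL=tmp@tmp
export GIT_COMMITTER_NAME=tmp GIT_COMMITTER_EMAIL=tmp@tmp

work=$(mktemp -d)
repo="$work/repo"
git init -q "$repo"
git -C "$repo" commit -q --allow-empty -m "kept"

# A commit object that no ref points at, standing in for leftover history.
orphan=$(git -C "$repo" commit-tree 'HEAD^{tree}' -m "unreachable")

# The same three cleanup steps as above.
git -C "$repo" reflog expire --expire=now --all
git -C "$repo" repack -ad -q
git -C "$repo" prune

if git -C "$repo" cat-file -e "$orphan" 2>/dev/null; then
    orphan_state=present
else
    orphan_state=pruned
fi
echo "unreachable commit is $orphan_state"
```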
This process is time consuming but not very complex. Hopefully it will save you all the time you were hoping for in the long run. At this point you will essentially have a repo with an amended history of only v3.0 onwards from the linux-stable.git repo. Just as if you had used --depth on clone, you have the same restrictions on the repo and can only modify and send patches from the history you already have. There are ways around that, but it deserves its own Q&A.
I am in the process of testing out the last few steps myself, but the git filter-branch
operation is still going. I'll update this post with any issues, but I'll go ahead and post it so you can start on this process if you find it acceptable.
UPDATE
Workaround for the issue (fatal: empty ident <> not allowed). This issue stems from a problem in the commit history of the linux repo.
Change the git filter-branch
command to:
git filter-branch --commit-filter '
    # Commits with an empty author email get placeholder identities.
    if [ "$GIT_AUTHOR_EMAIL" = "" ]; then
        GIT_AUTHOR_EMAIL="tmp@tmp"
        GIT_AUTHOR_NAME="tmp" GIT_COMMITTER_NAME="Me" GIT_COMMITTER_EMAIL="[email protected]" git commit-tree "$@"
    else
        git commit-tree "$@"
    fi
'
How about cloning the tag to a depth of 1?
git clone --branch mytag0.1 --depth 1 https://example.com/my/repo.git
Notes:
--depth 1 implies --single-branch, so no info from other branches is brought to the cloned repository.
If you want to clone a local repository, use file:// instead of only the repository path.
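The file:// note matters because git ignores --depth when it is given a plain local path and copies the whole history instead. A small sketch of the difference, using a made-up three-commit repository, made-up directory names and a tmp identity:

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=tmp GIT_AUTHOR_EMAIL=tmp@tmp
export GIT_COMMITTER_NAME=tmp GIT_COMMITTER_EMAIL=tmp@tmp

work=$(mktemp -d)
git init -q "$work/origin"
for i in 1 2 3; do
    git -C "$work/origin" commit -q --allow-empty -m "commit $i"
done

# Plain path: git warns that --depth is ignored and clones everything.
git clone -q --depth 1 "$work/origin" "$work/by_path"
# file:// URL: the depth limit is honoured.
git clone -q --depth 1 "file://$work/origin" "$work/by_url"

path_count=$(git -C "$work/by_path" rev-list --count HEAD)
url_count=$(git -C "$work/by_url" rev-list --count HEAD)
echo "plain path: $path_count commits, file:// URL: $url_count commits"
```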