I'm wondering what git is doing when it pushes changes, and why it occasionally seems to push far more data than the changes I've made. I made changes to two files that added around 100 lines of code - less than 2k of text, I'd imagine.
When I went to push that data up to origin, git turned it into over 47 MB of data:
git push -u origin foo
Counting objects: 9195, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6624/6624), done.
Writing objects: 100% (9195/9195), 47.08 MiB | 1.15 MiB/s, done.
Total 9195 (delta 5411), reused 6059 (delta 2357)
remote: Analyzing objects... (9195/9195) (50599 ms)
remote: Storing packfile... done (5560 ms)
remote: Storing index... done (15597 ms)
To <<redacted>>
* [new branch] foo -> foo
Branch foo set up to track remote branch foo from origin.
When I diff my changes (origin/master..HEAD), only the two files and the one commit I made show up. Where did the 47 MB of data come from?
I saw this: When I do "git push", what do the statistics mean? (Total, delta, etc.) and this: Predict how much data will be pushed in a git push, but neither really told me what's going on. Why would the pack / bundle be so huge?
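One way to check what git actually considers new is to count the objects that are not reachable from the remote-tracking ref. Here is a self-contained sketch; the throw-away repositories under mktemp, the demo identity, and the file name are made up for the demo - in your own repository you would run only the final rev-list command:

```shell
set -e
tmp=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$tmp/server.git"  # stand-in "origin"
git -c init.defaultBranch=master clone -q "$tmp/server.git" "$tmp/repo"
cd "$tmp/repo"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m base
git push -q origin HEAD:refs/heads/master
git fetch -q origin                     # make origin/master known locally
echo demo > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m change
# Every object (commit, tree, blob) git thinks is new relative to origin/master:
git rev-list --objects origin/master..HEAD | wc -l
```

If this count is small but the push still sends thousands of objects, git's view of what the server has must differ from origin/master, which is exactly the situation the accepted answer below describes.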
Try using the --verbose option to see what actually happens. Even if you made only small changes, some internal behavior might cause git to push a lot more data. Also have a look at git gc: it cleans up your local repository and might speed things up, depending on your issue.
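A quick sketch of those inspection commands (the throw-away repo under mktemp and the demo identity are only for illustration - in practice you run these inside your own repository):

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m init
git count-objects -v   # loose-object and pack statistics for the local repo
git gc --quiet         # repack loose objects, prune unreachable ones
git count-objects -v   # compare: objects should now live in a pack
```

Comparing the two count-objects reports shows what gc changed; git push --verbose then reports which refs are being negotiated with the server.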
I just realized that there is a very realistic scenario that can result in an unusually big push.
Which objects does push send? Those that do not yet exist on the server - or rather, those it did not detect as existing. How does it check object existence? At the beginning of a push, the server sends the references (branches and tags) it has. So, for example, if the two sides have the following commits:
CLIENT                        SERVER

(foo) -----------> aaaaa1
                     |
(origin/master) -> aaaaa0     (master) -> aaaaa0
                     |                      |
                    ...                    ...
then the client will get something like refs/heads/master aaaaa0, and find that it only has to send what is new in commit aaaaa1.
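You can see exactly this advertisement with git ls-remote. A self-contained sketch (the bare repo under mktemp stands in for the server, and the demo identity is made up):

```shell
set -e
tmp=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$tmp/server.git"   # the "server"
git -c init.defaultBranch=master clone -q "$tmp/server.git" "$tmp/client"
cd "$tmp/client"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m initial
git push -q origin HEAD:refs/heads/master
# Prints the same ref list the server sends at the start of a push:
git ls-remote origin
```

Each line is a commit id plus a ref name; those commit ids are all the client learns about the server's state before deciding what to send.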
But if somebody has pushed something to the remote master in the meantime, it is different:
CLIENT                        SERVER

(foo) -----------> aaaaa1     (master) -> aaaaa2
                     |                      |
(origin/master) -> aaaaa0               aaaaa0
                     |                      |
                    ...                    ...
Here, the client gets refs/heads/master aaaaa2, but it does not know anything about aaaaa2, so it cannot deduce that aaaaa0 exists on the server. So in this simple case of only two branches, the whole history will be sent instead of just the increment.
This is unlikely to happen in a grown-up, actively developed project, which has tags and many branches, some of which become stale and are no longer updated. Users might still send a bit more than necessary, but the difference does not get as big as in your case and goes unnoticed. In very small teams, though, it can happen more often, and the difference can be significant.
To avoid it, you could run git fetch before pushing. Then, in my example, the aaaaa2 commit would already exist on the client, and git push foo would know that it should not send aaaaa0 and the older history.
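The whole scenario, including the fix, can be replayed locally. A sketch with two throw-away clones (the mktemp paths and demo identity are made up; the branch name foo matches the example above):

```shell
set -e
tmp=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$tmp/server.git"
git -c init.defaultBranch=master clone -q "$tmp/server.git" "$tmp/a"
cd "$tmp/a"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m base   # aaaaa0
git push -q origin HEAD:refs/heads/master
# A second client advances master on the server (the aaaaa2 situation):
git clone -q "$tmp/server.git" "$tmp/b"
git -C "$tmp/b" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m advance                                                 # aaaaa2
git -C "$tmp/b" push -q origin HEAD:refs/heads/master
# Back on the first client: branch foo from aaaaa0, fetch, then push.
git checkout -q -b foo
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m feature  # aaaaa1
git fetch -q origin        # now aaaaa2 (and hence aaaaa0's presence) is known
git push -q -u origin foo  # only the genuinely new objects are sent
```

Without the fetch, the client in this setup would have no common commit with the advertised refs and would have to assume the server has nothing.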
See the Git pack protocol documentation for the details of the push implementation.
PS: the recent git commit-graph feature might help with this, but I have not tried it.