
Cloned from a colleague's computer, now pull from Bitbucket still downloads a lot

Tags: git

We have a Git repository that is quite large and we are behind a very slow Internet connection.

My colleague already had a recent copy of the repository, so I did a

git clone him:/home/foo/repo

in the LAN - which is fast :)

After that, he made some changes, so I did a git pull. That produced conflicts, which I merged.

Next, I ran

git remote rename origin him
git remote add origin <BITBUCKETURL>

I made some changes and tried to

git push origin master

which was rejected (no fast forward).

So I tried

git pull origin

But now Git wants to download megabytes of data, which I do not understand. I thought Git was smart enough to recognize the objects it already has and skip them. Right? I also tried a fresh clone from my colleague and added the Bitbucket URL without doing any merge; same problem.

What should I do to fix this?

EDIT to address the questions in the comments:

  • there are no other branches that I am aware of; git pull origin master has the same effect
  • running git pull origin master prints: remote: Counting objects: 1535. There is no way that many changes were made in the meantime.
  • I compared the logs; there are no changes online (Bitbucket) that are missing from the colleague's computer I cloned from

EDIT (much later)

Much later I realized, although I can no longer verify it, that I may have made a mistake with the remote and added a completely different repository. That would explain everything.
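
A rough way to test that hypothesis, assuming the suspect branch has already been fetched: git merge-base exits non-zero when two commits have no common ancestor, which is what a completely different repository would look like. The helper name shares_history is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the two commits share an ancestor.
# Run it inside the local repository after the suspect remote was fetched.
shares_history() {
  git merge-base "$1" "$2" >/dev/null 2>&1
}
```

For example, after git fetch origin, running shares_history HEAD origin/master || echo "unrelated histories" would flag a remote that points at the wrong project.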

Alex asked Jan 16 '16 15:01



2 Answers

This is not a direct answer, but it's too big for a comment. I just tried to reproduce your situation and it works as expected for me (no download from Bitbucket on pull).

Some possible reasons why this doesn't work for you:

1) Check the colleague's repository: does it have its remotes set up properly? I am not sure, but Git may use the remote metadata to work out how repositories relate (just a guess).

2) Maybe the colleague's repository is not up to date with Bitbucket? In that case the pull simply downloads the new data. Try updating the colleague's repository first.
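
Point 2 can be checked cheaply before pulling, because git ls-remote only transfers the list of refs and no objects. The sketch below is a hypothetical helper (the name missing_tips is not from any tool) that lists the branch tips a remote advertises but the local repository does not yet contain:

```shell
# Hypothetical helper: print every branch tip advertised by the given remote
# URL whose commit is not present in the local object store.
# Run it from inside the local repository.
missing_tips() {
  remote_url=$1
  git ls-remote --heads "$remote_url" | while read -r sha ref; do
    # cat-file -e exits non-zero when the object does not exist locally
    if ! git cat-file -e "$sha^{commit}" 2>/dev/null; then
      echo "missing: $ref ($sha)"
    fi
  done
}
```

If missing_tips <BITBUCKETURL> prints nothing, the local clone already has every branch tip and a pull should have nothing substantial to download.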

Here is a shell script I used to check the problem, you can play around with something like this to find out what causes the behavior you see:

# Change this to your repository url
BITBUCKET_URL=git@bitbucket.org:user/project

git clone $BITBUCKET_URL project
# Cloning into 'project'...
# Warning: Permanently added the RSA host key for IP address 'xxx.xxx.xxx.xxx' to the list of known hosts.
# remote: Counting objects: 163, done.
# remote: Compressing objects: 100% (154/154), done.
# remote: Total 163 (delta 53), reused 0 (delta 0)
# Receiving objects: 100% (163/163), 3.62 MiB | 1.30 MiB/s, done.
# Resolving deltas: 100% (53/53), done.
# Checking connectivity... done.

mkdir mycopy
cd mycopy
git clone ../project .
# Cloning into '.'...
# done.
ls
# application.py  database.py  README.md  requirements.txt  static
git remote -v show
# origin    /home/seb/test/gitdist/mycopy/../project (fetch)
# origin    /home/seb/test/gitdist/mycopy/../project (push)

git remote rename origin local
git remote add origin $BITBUCKET_URL
git remote -v show
# local /home/seb/test/gitdist/mycopy/../project (fetch)
# local /home/seb/test/gitdist/mycopy/../project (push)
# origin    git@bitbucket.org:owner/project.git (fetch)
# origin    git@bitbucket.org:owner/project.git (push)

git pull origin
# Warning: Permanently added the RSA host key for IP address 'xxx.xxx.xxx.xxx' to the list of known hosts.
# From bitbucket.org:owner/project
#  * [new branch]      master     -> origin/master
# You asked to pull from the remote 'origin', but did not specify
# a branch. Because this is not the default configured remote
# for your current branch, you must specify a branch on the command line.

The output of each command is included above. You can see the initial download into the project folder (this emulates your colleague's repository), and then no download in the local repository when I rename the origin, add the Bitbucket URL as the new origin, and run git pull origin.
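
The same experiment can be run without any network access. This is a minimal sketch, assuming a local bare repository (called hub.git here, a made-up name) is an acceptable stand-in for Bitbucket; it counts the objects in the clone before and after fetching from the newly added origin:

```shell
#!/bin/sh
# Sketch: fetching from a second remote with identical history should add
# no objects. "hub.git" stands in for Bitbucket; all names are illustrative.
set -e
work=$(mktemp -d)
cd "$work"

git init -q --bare hub.git
git clone -q hub.git colleague 2>/dev/null   # empty clone, warning silenced
(
  cd colleague
  echo hello > file.txt
  git add file.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
  git push -q origin HEAD
)

# Clone from the colleague, then swap the remotes as in the question.
git clone -q colleague mine
cd mine
git remote rename origin him
git remote add origin "$work/hub.git"

# Total objects = loose objects + packed objects.
count_objects() {
  git count-objects -v | awk '/^count:|^in-pack:/ { n += $2 } END { print n }'
}

before=$(count_objects)
git fetch -q origin
after=$(count_objects)
echo "objects before: $before, after: $after"
```

If the two numbers differ, the new remote contains history that the clone did not have.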

Update: checking with two git versions

As mentioned in the comments on the other answer, there are two versions of Git involved: git 1.9.4 on the colleague's machine and git 2.1.4 locally. I also have 2.1.4 locally, so I built the 1.9.4 version this way:

git clone git://git.kernel.org/pub/scm/git/git.git
cd git                      # the remaining commands run inside the source tree
git checkout v1.9.4
make configure
./configure --prefix=/usr
make all
./git --version
# git version 1.9.4
cd ..                       # back to the directory that holds the test script

Now I modified the test script this way:

# Change this to your repository url
BITBUCKET_URL=git@bitbucket.org:bosonz/gameofdata.git

GIT194=./git/git

$GIT194 --version

$GIT194 clone $BITBUCKET_URL project
# Cloning into 'project'...
# ....
# (the rest is unchanged)

Result - there is still no problem, download from bitbucket is still only done once.

Boris Serebrov answered Sep 30 '22 07:09


I believe rename changes the refs stored for that branch on the filesystem, so that may have modified something which prevents it from linking up the objects without downloading.

You might be able to work around it by temporarily pointing origin back at your colleague's repo, letting it recopy the data it wants, and then pointing origin at bitbucket again. First

git remote set-url origin him:/home/foo/repo

then git pull origin master would hopefully re-download those same 1535 objects across the LAN. If it does, then you can use set-url again to point it back at bitbucket

git remote set-url origin <bitbucket-url>

and at that point Git should have all of the needed objects, because only the remote URL will have changed, so git pull should report that it is already up to date.

David Ulrich answered Sep 30 '22 07:09