
Is there a way to easily convert a series of tarballs of a source tree into a git repository?



I'm new to git and I have a moderately large number of weekly tarballs from a long-running project. Each tarball contains a few hundred files on average. I'm looking for a git strategy that will let me add the expanded contents of each tarball to a new git repository, starting from version 1.001 and going through version 1.650. At this stage of the project, 99.5% of tarball(n) is just a copy of version(n-1); in other words, a perfect candidate for git. The desired end result is to have only the master branch remaining at the end of the process.

I think I know git well enough to do this "by hand". As I understand it, there is no possibility of a merge conflict, since there will be no opportunity to change master before the next version is added and committed. A shell script is my first guess, but I'm not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1. For the purposes of this project, the host environment is Ubuntu 10.04, with 8 GB of RAM, 500 GB of free disk space, and a 4-core CPU at 3.ghz.

I don't need someone else to solve the problem but I could use a nudge in the right direction as to how a git expert would approach it. Any advice from someone who's "been there done that" would be appreciated.


PS: I have looked at the site's suggested "related questions" and found nothing relevant.

Hotei asked May 03 '10 17:05


3 Answers

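Take a look at $GIT_SRC_DIR/contrib/fast-import/import-tars.perl, a Perl script shipped in the Git source tree that imports a series of tarballs through git fast-import.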

Stefan Näwe answered Oct 18 '22 17:10

Regarding this comment:

I'm not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1

Are you concerned about two operations running concurrently and getting in each other's way? This shouldn't be a problem unless you intentionally run operations in parallel.

Assuming the tarballs follow a linear evolution, branching shouldn't come into this at all.

The process should be fairly straightforward:

  1. git init
  2. Untar tarball n.
  3. git add --all .; git commit (with appropriate flags)
  4. git tag -a v1.001 -m "Version 1.001." (substituting the version of tarball n)
  5. rm -rf * (so that files dropped from the next tarball are recorded as deletions; the * glob skips dotfiles, so .git stays intact)
  6. Go to step 2 with the next tarball.
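
The steps above can be sketched as a small bash script. This is only an illustration under some assumptions that are not in the question: tarballs are named like project-1.001.tar.gz, they unpack directly into the repository root (add --strip-components=1 to the tar call if they contain a top-level directory), and they are passed oldest first. The function names tag_for and import_tarballs are made up for this sketch. It clears the work tree before each extraction rather than after, which has the same effect as step 5.

```bash
#!/usr/bin/env bash
# Sketch: turn a series of tarballs into successive tagged commits on one branch.
# Assumed (hypothetical) naming scheme: project-1.001.tar.gz
set -euo pipefail

# Derive a tag such as "v1.001" from a name like "project-1.001.tar.gz".
tag_for() {
    local base=${1##*/}     # strip any leading directory
    base=${base%.tar.*}     # strip .tar.gz, .tar.bz2, ...
    echo "v${base##*-}"     # keep what follows the last '-'
}

# Usage: import_tarballs REPO_DIR TARBALL...   (tarballs oldest first)
import_tarballs() {
    local repo=$1; shift
    local tarball tag
    mkdir -p "$repo"
    git -C "$repo" init -q
    for tarball in "$@"; do
        tag=$(tag_for "$tarball")
        # Empty the work tree (dotfiles included) but keep .git, so files
        # missing from this tarball are recorded as deletions.
        find "$repo" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
        tar -xf "$tarball" -C "$repo"
        git -C "$repo" add --all .
        # --allow-empty keeps the loop going if two tarballs are identical.
        git -C "$repo" commit -q --allow-empty -m "Version ${tag#v}"
        git -C "$repo" tag -a "$tag" -m "Version ${tag#v}."
    done
}
```

A call might look like import_tarballs ./repo tarballs/project-*.tar.gz. Because the version numbers in the question are zero-padded (1.001 through 1.650), the shell glob already expands in the right order. git needs a configured user.name and user.email for the commits and annotated tags to succeed.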

Marcelo Cantos answered Oct 18 '22 17:10

What I would do in this situation, as you have tarballs that are in the end 'tagged versions':

  1. create empty git repository
  2. extract a tarball to that directory overwriting any files
  3. add all files: git add .
  4. git commit -a -m 'version foo'
  5. tag the current version with git tag
  6. remove all files
  7. repeat from step 2 for each tarball

In your case it's not necessary to create branches, as all your tarballs are distinct, successive versions; each iteration overwrites the previous one.
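
Since each iteration overwrites the previous one, the order in which the tarballs are processed becomes the history. With zero-padded versions such as 1.001 through 1.650, plain alphabetical order is enough. If the numbering were not padded, a version-aware sort would be needed. A minimal sketch, assuming a sort that supports -V (GNU coreutils does), file names without newlines, and a hypothetical helper name sorted_tarballs:

```bash
# List the tarballs in a directory in ascending version order, one per line.
# Assumption: names look like project-1.9.tar.gz, project-1.10.tar.gz, ...
sorted_tarballs() {
    local dir=$1
    # sort -V compares runs of digits numerically, so 1.9 sorts before 1.10.
    find "$dir" -maxdepth 1 -type f -name '*.tar.*' | sort -V
}
```

The output can then drive the extract, commit, and tag loop one tarball at a time.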

Marcin Gil answered Oct 18 '22 17:10