As far as I know, Git names each blob by the SHA-1 hash of its content so that identical files are not duplicated in the repository. For example, if file A contains "abc" and its SHA-1 hash is "12345", then as long as the content doesn't change, every commit and branch can point to the same SHA-1.

But what happens if file A is modified to "def", so that its hash becomes "23456"? Does Git store both file A and the modified file A (the whole file, not just the difference)?

- If so, why? Wouldn't it be better to store only the diff information?
- If not, how does diff track the changes in a file?
- What about other version control systems such as CVS, SVN, and Perforce?


Git's blob data and diff information

ADDED

The following from 'Git Community Book' answers most of my questions.

It is important to note that this is very different from most SCM systems that you may be familiar with. Subversion, CVS, Perforce, Mercurial and the like all use Delta Storage systems - they store the differences between one commit and the next. Git does not do this - it stores a snapshot of what all the files in your project look like in this tree structure each time you commit. This is a very important concept to understand when using Git.
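The snapshot idea in the quoted passage can be sketched with a toy content-addressed store. This is only an illustration under simplifying assumptions: `ToyStore` and its methods are hypothetical names, the hash is taken over the raw content alone, and nothing here reflects Git's real on-disk format.

```python
import hashlib
from typing import Dict, List


class ToyStore:
    """Hypothetical content-addressed store; not Git's real object format."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}        # blob id -> full file content
        self.commits: List[Dict[str, str]] = []  # each commit: file name -> blob id

    def commit(self, files: Dict[str, bytes]) -> int:
        """Record a snapshot of every file and return the commit index."""
        tree = {}
        for name, content in files.items():
            # Simplification: real Git also hashes a small header.
            blob_id = hashlib.sha1(content).hexdigest()
            self.blobs[blob_id] = content  # identical content reuses the same key
            tree[name] = blob_id
        self.commits.append(tree)
        return len(self.commits) - 1


store = ToyStore()
first = store.commit({"A": b"abc", "B": b"unchanged"})
second = store.commit({"A": b"def", "B": b"unchanged"})

# Both versions of A are kept in full; B is stored once and shared.
print(len(store.blobs))
```

Each commit records a complete mapping of file names to blob IDs, yet an unchanged file costs no extra storage because both snapshots point at the same blob.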


asked Sep 18 '10 21:09

prosseek

1 Answer

Git stores files by content rather than as diffs, so in your example both versions of A ("abc" and "def") would be stored in the object database as separate blobs.
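As a rough sketch of why the two versions end up as separate objects, assuming the default SHA-1 object format: a blob ID is the SHA-1 of a short header (`blob`, the content length in bytes, a NUL byte) followed by the file content. The helper name `git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID a SHA-1 Git repository would give a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Different content gives different IDs, so both versions are kept.
print(git_blob_id(b"abc"))
print(git_blob_id(b"def"))

# The same content always gives the same ID, so it is stored only once.
print(git_blob_id(b"abc") == git_blob_id(b"abc"))  # True
```

The result should match what `git hash-object` prints for a file with the same bytes; note that a trailing newline is part of the content and changes the ID.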

Storing whole objects works out better because it is very easy to see whether two versions of a file are the same: just compare their SHAs. Have a look at the git-book for details on how the objects are stored. It also means any version can be read back directly; if files were tracked only as diffs, you would need the history of a file to reconstruct it. That is easy to do in a centralised system, but not in a distributed system where there can be many different changes to a file. (Git may later delta-compress objects when it packs them, but that is a storage optimisation: each object is still identified by the hash of its full content.)
Git performs the diff directly from the objects.
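To illustrate that a diff can be computed on demand from two full snapshots, here is a small sketch using Python's standard `difflib`. It is not Git's own diff implementation, and `diff_snapshots` is a hypothetical helper name.

```python
import difflib


def diff_snapshots(old: str, new: str, name: str = "A") -> str:
    """Build a unified diff from two complete versions of a file."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


# No stored delta is needed: both full contents are read and compared.
print(diff_snapshots("abc\n", "def\n"))
```

Because every version is available in full, any two versions can be compared this way, not only consecutive ones.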

answered Sep 19 '22 01:09

Abizern

