
Is Git recommended for large (>250GB) content repositories?

The web application is a custom-built CMS with several sub-applications, each of which keeps its code and content in the same directory structure. Because of the application framework's architecture, code and content are intertwined (content depends on the code for its display and other functionality) and cannot be separated. The content is not stored as BLOBs; it is stored as files, and the underlying DB links them. Sub-applications range in size from 20GB to 250GB and more (this is the killer).

The web application will receive code enhancements (new sub-applications, bug fixes, etc.) while users add and update content through the already live system. Hence a deployment/release process is required and, most importantly, a version control system needs to be chosen for both code and content.

Git comes into the picture for several reasons: it is open source and free, branching and merging are easy, and it is not centralized, so there is no single point of failure.

BUT after some initial research on the web, I found some disappointing facts that apply to our application: using Git for large systems like ours is painful (checkout, clone, merge, push, pull), and the commands are complicated ("geeky" would be more appropriate) for a developer base that is unfamiliar with DVCS and mostly on Windows.

I am not set on Git, but if I have to go for a centralized approach (in the really WORST case), what should I use (CVS and SVN aside)? I have read that Perforce is stable and is also used at Google (I expect some bashing here!!).

Please share your views, guidance, and comments. I really need them.

asked Jun 16 '09 05:06 by kaychaks




1 Answer

I just happened to be reading this blog post not one minute ago. It's a bit of a rant about the scalability of git.

Edit: Eight years later, Git has Large File Storage (LFS), and Microsoft is open-sourcing the Git Virtual File System (GVFS) so that it can use Git to develop Windows.
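If you do go the LFS route, a first step is usually working out which file types are actually responsible for the size. Here is a rough sketch in Python that walks a working tree and totals the large files per extension. The function name `large_file_report` and the 50 MB default threshold are my own assumptions for illustration, not anything Git or Git LFS prescribes.

```python
import os
from collections import defaultdict


def large_file_report(root, threshold_bytes=50 * 1024 * 1024):
    """Return {extension: [file_count, total_bytes]} for files >= threshold.

    Hypothetical helper: the name and the default threshold are
    illustrative assumptions, not part of Git or Git LFS.
    """
    report = defaultdict(lambda: [0, 0])
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable file or broken symlink: skip it.
                continue
            if size >= threshold_bytes:
                ext = os.path.splitext(name)[1].lower() or "(no extension)"
                report[ext][0] += 1
                report[ext][1] += size
    return dict(report)


if __name__ == "__main__":
    # Print the heaviest extensions first.
    rows = sorted(large_file_report(".").items(), key=lambda kv: -kv[1][1])
    for ext, (count, total) in rows:
        print(f"{ext}: {count} files, {total / 1024 ** 2:.1f} MiB")
```

The extensions at the top of that list are the ones you would probably consider handing to `git lfs track` (for example `git lfs track "*.psd"`); whether that is enough for a 250GB tree is something you would have to measure on your own content.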

answered Sep 19 '22 20:09 by pgs