I see a lot of sites referring to git, github, svn, subversion etc, but I never really knew what all of those things are. I also hear a lot of terms like 'svn repo', 'commit', and 'push' - I tried googling but it seems that I have so little knowledge about the subject that I don't even know where to get started.
Could someone give me the initial push so I can continue doing research on my own? What are these things all about?
Thanks!
guys: thank you so much for all the really long and encompassing explanations. I wish I could choose more than one answer, but unfortunately SO doesn't allow that (they should have a vote 1st, 2nd, and 3rd place feature or something). thank you all very much!
What is Git? Git is a DevOps tool used for source code management. It is a free and open-source distributed version control system that handles small to very large projects efficiently. Git is used to track changes in source code, enabling multiple developers to work together on non-linear development.
Many people prefer Git for version control for a few reasons. One is that it's faster to commit. In SVN every commit goes to the central repository, so network traffic slows everyone down. With Git, you commit to your local repository and only push to the central repository every so often.
git-svn is a specialized tool for Git users to interact with Subversion repositories. It works by providing a Git frontend to an SVN backend. With git-svn, you use Git commands on the local repository, so it's just like using normal Git. However, behind the scenes, the corresponding SVN operations are sent to the server.
Version control (a.k.a. revision control).
Consider the following problem. You're working on a project with someone else and you're sharing files. You both need to work on, say, "WhateverController.java". It's a huge file and you both need to edit it.
The most primitive way to deal with this is to not edit the file at the same time, but then both of you have to be on the same page. When you've got a team, especially one with dozens, hundreds, or thousands of members (typical for open-source projects), this becomes completely impossible.
An old, primitive "solution" to this problem was to have a checkout/checkin mechanism. When you need to edit a file, you "check it out", and the file is locked so no one else can edit it until you unlock it by "checking it in". This is done through the appropriate software, for example Microsoft's breathtakingly stupid piece of crap SourceSafe. But when people forget to "check the file in", then no one else can edit that file while it's in use. Then someone goes on vacation or leaves the project for some other reason and the result is unending chaos, confusion and usually quite a bit of lost code. This adds tremendous management work.
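To make the locking problem concrete, here is a minimal toy sketch of the check-out/check-in idea. The class and method names (LockingStore, check_out, check_in) are made up for illustration and are not SourceSafe's actual API.

```python
class LockingStore:
    """Toy model of lock-based check-out/check-in (hypothetical names)."""

    def __init__(self):
        self.locks = {}  # file name -> user currently holding the lock

    def check_out(self, filename, user):
        # Refuse if someone else already holds the lock on this file.
        holder = self.locks.get(filename)
        if holder is not None and holder != user:
            raise PermissionError(f"{filename} is locked by {holder}")
        self.locks[filename] = user

    def check_in(self, filename, user):
        # Only the lock holder can release the file.
        if self.locks.get(filename) != user:
            raise PermissionError(f"{user} does not hold the lock on {filename}")
        del self.locks[filename]


store = LockingStore()
store.check_out("WhateverController.java", "alice")
try:
    store.check_out("WhateverController.java", "bob")
    blocked = False
except PermissionError:
    blocked = True  # bob is stuck until alice checks the file in
```

If alice goes on vacation without calling check_in, bob stays blocked, which is exactly the chaos described above.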
Then came CVS, and subsequently Subversion, which the authors call "CVS done right", so CVS and Subversion are essentially the same idea. With those, there is no exclusive check-out that locks the file. You just edit the files you need and check them in. Note that the actual files are stored on a central server, and each user runs the software on their own workstation as well. This location on the server is called a repository.
Now, what happens if two people are working on the same file in CVS/Subversion? Their changes are merged, typically using GNU diff and patch. 'diff' is a utility that extracts the difference between two files. 'patch' uses such 'diff' files to patch other files.
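If you have never seen a diff, here is a small sketch using Python's standard difflib module, which produces the same kind of unified diff text that GNU diff does. The file contents are invented for the example. Python's standard library has no counterpart to 'patch', so only the diff half is shown.

```python
import difflib

# Two versions of the same (made-up) file, as lists of lines.
original = ["void save() {\n", "    write(data);\n", "}\n"]
edited = ["void save() {\n", "    validate(data);\n", "    write(data);\n", "}\n"]

# unified_diff yields the header lines, then context, '-' and '+' lines.
diff_lines = list(
    difflib.unified_diff(
        original,
        edited,
        fromfile="a/WhateverController.java",
        tofile="b/WhateverController.java",
    )
)
print("".join(diff_lines))
```

Lines starting with '+' were added and lines starting with '-' were removed. Everything else is unchanged context that helps 'patch' find the right spot.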
So if you're working on WhateverController.java in one function, and I'm working on the same file in a different function, then when you're done with your stuff, you simply check it in, and the changes are applied to the file on the server. Meanwhile, my local copy has no idea of your changes so your changes do not affect my code at all. When I'm done with my changes, I check the file in as well. But now we have this seemingly complicated scenario.
Let's call the original WhateverController.java, file A. You edit the file, and the result is file B. I edit the same file at a different location, without your changes, and this file is file C.
Now we seemingly have a problem. The changes in files B and C are both changes to file A. Ridiculously backwards junk like SourceSafe or Dreamweaver will usually end up overwriting the changes of file B (because it got checked in first).
CVS/Subversion and presumably Git (which I know almost nothing about) create patches instead of just overriding files.
The difference between file A and C is produced and becomes patch X. The difference between A and B is produced and becomes patch Y.
Then patches X and Y are both applied to file A, so the end result is file A + the changes made to B and C on our respective workstations.
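Here is a rough sketch of that idea in Python. It is a toy that assumes the two edits touch different lines. It is not how CVS or Subversion actually implement merging, and the function names (changes, merge) are my own.

```python
import difflib


def changes(base, other):
    """Return (start, end, replacement) edits that turn base into other."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge(base, mine, theirs):
    """Apply both sets of edits to base; raise if they overlap."""
    edits = sorted(changes(base, mine) + changes(base, theirs),
                   key=lambda e: (e[0], e[1]))
    merged, pos = [], 0
    for start, end, replacement in edits:
        if start < pos:
            raise ValueError("conflict: both edits touch the same lines")
        merged.extend(base[pos:start])  # untouched lines before this edit
        merged.extend(replacement)      # the edited lines
        pos = end
    merged.extend(base[pos:])           # untouched tail of the file
    return merged


file_a = ["class WhateverController {", "  void first() { }", "  void second() { }", "}"]
file_b = ["class WhateverController {", "  void first() { log(); }", "  void second() { }", "}"]
file_c = ["class WhateverController {", "  void first() { }", "  void second() { save(); }", "}"]

result = merge(file_a, file_b, file_c)
```

Here result contains your change to first() and my change to second(), even though neither of us ever saw the other's file.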
Usually this works flawlessly. Sometimes we might be working on the same function in the same code, in which case CVS/Subversion will notify the programmer of a conflict and mark the conflicting lines within the file itself. Those conflicts are usually easily fixed, at least I've never had any problem solving them. Graphical utilities such as Visual Studio, Project Builder (Mac OS X) and the like usually show you both files and the conflicts, so you can choose which lines you want to keep and which to throw away... and you can also edit the file by hand if you want to resolve the conflict manually.
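For a feel of what "presenting the problem within the file itself" looks like, here is a tiny sketch that wraps two competing versions of the same lines in conflict markers. The marker style imitates what these tools write, but the function name and the labels (.mine, .theirs) are placeholders I picked, and the exact output of a real tool may differ.

```python
def mark_conflict(mine, theirs, mine_label=".mine", theirs_label=".theirs"):
    """Wrap two competing versions of the same lines in conflict markers."""
    return (
        [f"<<<<<<< {mine_label}"] + mine
        + ["======="] + theirs
        + [f">>>>>>> {theirs_label}"]
    )


# Both of us changed the same line, in different ways.
my_lines = ["    return total * 2;"]
your_lines = ["    return total + 1;"]

block = mark_conflict(my_lines, your_lines)
print("\n".join(block))
```

To resolve it, you delete the marker lines and keep whichever version (or combination) is correct, then check the file in.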
So in essence, source control is a solution to the problem of multiple people working on the same files. That's basically it.
I hope this explains.
EDIT: There are many other benefits with decent source control systems like Subversion and presumably Git. If there's a problem, you can go back to other versions so you don't have to keep manual backups of everything. In fact, at least with Subversion, if I mess something up or want to take a look at an old version of the code, I can do so without interfering with anyone else's work.
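The "go back to other versions" part can be pictured with a toy like the one below. Real systems store history far more cleverly, and the names here (TinyHistory, commit, checkout) are only for illustration.

```python
class TinyHistory:
    """Toy revision store: keeps every committed version of one file."""

    def __init__(self):
        self.revisions = []  # list of (message, content) tuples

    def commit(self, content, message):
        self.revisions.append((message, content))
        return len(self.revisions)  # 1-based revision number

    def checkout(self, revision):
        # Look up an old version without changing the stored history.
        return self.revisions[revision - 1][1]


history = TinyHistory()
r1 = history.commit("print('v1')", "first working version")
r2 = history.commit("print('v2, broken')", "refactoring gone wrong")

old_code = history.checkout(r1)  # get the working version back
```

Looking at revision 1 does not remove revision 2, so nobody else's work is disturbed.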