Pros and cons for keeping code and data in separate repositories

We have a project that has both data and code, bundled into a single Mercurial repository. The data is just as important as the code (it contains parameters for business logic, some inputs, etc.). However, the format of the data files rarely changes, and it's quite natural to change the data files independently of the code.

One advantage of the unified repository is that we don't have to keep track of multiple revisions: if we ever need to recreate output from a previous run, we only need to update the system to the single revision number stored in the output log.
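
A minimal sketch of that workflow, assuming the revision is written to the log as a single line. The `hg-revision: ` prefix and all function names are hypothetical and chosen only for illustration; only `hg identify --id` and `hg update --rev` are actual Mercurial commands.

```python
import re
import subprocess

# Hypothetical log line prefix; the actual log format is not specified.
LOG_PREFIX = "hg-revision: "


def current_revision(repo_path="."):
    """Return the short changeset hash of the working directory."""
    result = subprocess.run(
        ["hg", "identify", "--id"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    # A trailing "+" means there are uncommitted changes in the working directory.
    return result.stdout.strip()


def log_header(revision):
    """Format the line that records the revision in an output log."""
    return LOG_PREFIX + revision


def revision_from_log(log_text):
    """Extract the recorded revision from a log, or None if it is absent."""
    pattern = r"^" + re.escape(LOG_PREFIX) + r"([0-9a-f]+)\+?$"
    match = re.search(pattern, log_text, re.MULTILINE)
    return match.group(1) if match else None


def update_command(revision):
    """Build the command that restores the working directory to a revision."""
    return ["hg", "update", "--rev", revision]
```

With one repository, a single call to `update_command(revision_from_log(log_text))` is enough to describe how to get back to the state of a previous run.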

One disadvantage is that if we modify the data while multiple heads are active, we may lose the data changes unless we manually copy those changes to each head.

Are there any other pros/cons to splitting the code and the data into separate repositories?

asked Nov 30 '12 06:11

max

1 Answer

Multiple repos:

Pros:
- Component-based approach: you identify groups of files that can evolve independently of one another.
- Configuration specification: you list the references (here, "revisions") your system needs in order to work. To modify one part without changing the other, you update that list.
- Partial clones: if you don't need all the components, you can clone only the ones you want (this doesn't apply in your case).
Cons:
- Configuration management: you need to track that configuration (usually through a parent repo that registers the others as subrepos).
- In your case, the data is quite dependent on particular versions of the project (new data may not make sense for old versions of the project).
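
To make the "list of references" idea concrete, here is a rough sketch of what tracking that configuration by hand could look like. The JSON manifest format, the function names, and the `code`/`data` repository paths are all assumptions made for illustration; a parent repo with subrepos (`.hgsub` and `.hgsubstate`) records the same pairing for you.

```python
import json


def make_manifest(code_rev, data_rev):
    """Serialize the pair of revisions used by a run (hypothetical format)."""
    return json.dumps({"code": code_rev, "data": data_rev}, sort_keys=True)


def restore_commands(manifest_text, code_repo="code", data_repo="data"):
    """Build one `hg update` command per repository from a stored manifest."""
    manifest = json.loads(manifest_text)
    return [
        ["hg", "--repository", code_repo, "update", "--rev", manifest["code"]],
        ["hg", "--repository", data_repo, "update", "--rev", manifest["data"]],
    ]
```

The point of the sketch is the cost: every run now has to store two revisions instead of one, and both must be restored together to reproduce an output.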

One repo:

Pros:
- System-based approach: you see your modules as one system (project and data).
- Repo management: everything is in one place.
- Tight link between modules (which can make sense for data).
Cons:
- Data propagation (when, as you mention, several heads are active).
- Intermediate revisions: some commits exist only because the data changed, not because of a new feature.
- Larger clone (not relevant here, unless your data includes large binaries).
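
The data propagation cost can be sketched as follows, assuming the data change is a single changeset and that `hg graft` is used to copy it. The function name and the choice of graft over merging the heads are illustrative assumptions, not a recommendation from the question.

```python
def propagation_plan(data_changeset, heads):
    """Return the hg commands that copy one data changeset onto each head.

    `heads` should list only the heads that still lack the data change.
    """
    plan = []
    for head in heads:
        # Switch the working directory to the head, then copy the change onto it.
        plan.append(["hg", "update", "--rev", head])
        plan.append(["hg", "graft", "--rev", data_changeset])
    return plan
```

Each active head adds two more commands (and one more changeset), which is the manual copying the question describes.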

For non-binary data, with infrequent changes, I would still keep them in the same repo.

answered Oct 22 '22 16:10

VonC

Tags: dvcs, mercurial, development-environment
