 

Continuous Integration Workflow & SVN

OK this may be a long one.

I'm trying to standardize and professionalize the setup we have at my workplace for doing live updates to our software. Currently, everything is manual, so we have an opportunity to start from scratch.

I have installed Jenkins, have repositories to work with, have a job template (based on http://jenkins-php.org), and have the following workflow in mind:

  • The main repository ('trunk') sits on the development server in its own virtual host
  • Each developer creates a branch for a specific bug/enhancement/addition, with its own virtual host

1) First question: at this stage, is it recommended practice to run a Jenkins job/build once the developer commits to their branch? This would run unit tests and some other steps (as per the template linked above).

  • Once the developer branch has been tested, approved, etc., the lead developer merges that branch into trunk

2) At this stage, I would like the Jenkins job to be re-run once the merge is complete. Is this possible, and is it the correct way to do this?

  • Once that merge has been tested and approved, we then deploy to the live site (I'm guessing this can be added as a task/target to the deploy job, triggered automatically once all tests pass in the step above?).

3) I have read somewhere that people also keep trunk checked out on the live site, and the deploy task simply does an svn update rather than FTPing files across. Again, is this the correct way to do it?

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I'm also slightly confused by the Jenkins workspace. The template I am using is set to dump a build/ folder into the workspace. With SVN configured, this also seems to do a checkout of the project into the workspace. What exactly is the intention of the workspace, and what do you do with it once it's populated?

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
UPDATE:

On deployment to live servers... it seems I have three options (each sketched below):

1) Have a checked-out copy of trunk on the live site, with the .svn files and directories hidden via .htaccess, and run an svn update once ready. The advantages: it's fast and it can be rolled back. The downside: security issues?

2) svn export either to the live folder directly (taking the site down temporarily during the process) or to some other folder, then change the Apache vhost to the new location.

3) Use rsync or a similar tool?

Which is the better option, or is there a better one I am missing?
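
For concreteness, here is roughly what each of those three options might look like on the command line. This is only a sketch; every path, URL, and revision number below is a placeholder, not something from the actual setup.

    # Option 1: working copy on the live server, updated in place.
    # Pinning to a known-good revision makes rollback just another update.
    svn update --revision 1234 /var/www/live
    # The .svn metadata still needs to be hidden from the web, e.g. with an
    # Apache rule along the lines of:  RedirectMatch 404 /\.svn(/|$)

    # Option 2: clean export into a new directory, then switch the vhost.
    # No .svn directories on disk, and the switch is close to atomic.
    svn export --revision 1234 http://svn.example.com/repo/trunk /var/www/releases/1234
    ln -sfn /var/www/releases/1234 /var/www/current   # vhost DocumentRoot points here
    apachectl graceful

    # Option 3: export somewhere private, then rsync only the changes across.
    svn export http://svn.example.com/repo/trunk /tmp/build
    rsync -az --delete /tmp/build/ deploy@www.example.com:/var/www/live/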

asked Mar 13 '13 by djjjuk




1 Answer

Usually, with Continuous Integration, all work done by all developers is done on a single branch (or on trunk). In fact, Subversion was designed with this type of workflow in mind. Having everyone work on the same branch may sound scary. After all, what about collisions? What if I want to include one bug/enhancement/addition, but not another in my next release?

My experience is that forcing all development to take place on a single branch simply works better. It forces developers to make small, careful changes and to work with each other. The order in which issues (bug fixes and enhancements) are handled must be decided at the beginning of the development cycle instead of attempting to pick and choose them at the end. Managers like the flexibility of picking and choosing at the end, but it means a massive merge cycle a few days before a release, which usually results in rushed testing.

If you do decide to use private branching, I would recommend that you set up two Jenkins instances. The first will be the official one. You build off of trunk and the release branches, but not developer branches. This Jenkins will do all unit testing. It will store the artifacts required for a release. It will report on the tests. It will be where your QA team pulls releases for testing. Only you are allowed to set up jobs on this Jenkins.

The other will be for the developers. They can set up a job and run tests on it if they want. It will be for their branches -- the private branches or bug fix branches.

In answer to your first question: I wouldn't bother with running Jenkins on private branches at all. Private branches used to be called sandboxes because developers could play there. The joke was that developers pretty much did in their sandbox what kitty cats did in sandboxes. Enforcing continuous integration on a private branch really takes away the purpose of that branch. You really only care once the code is delivered to trunk.

That's why I recommend the two-Jenkins setup. The first is for you. It will execute a build whenever a commit/check-in happens. The second is for the developers and their private branches. They'll set up jobs if they want and execute the build when they want. Its sole purpose is to help developers make sure that everything will work once their code is delivered to the trunk.
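
If you want the official Jenkins to react immediately rather than poll the repository, a Subversion post-commit hook can kick off the trunk job. A minimal sketch, assuming a job named trunk-build with "Trigger builds remotely" enabled; the host name, job name, and token are placeholders:

    #!/bin/sh
    # hooks/post-commit -- runs on the SVN server after every commit.
    REPOS="$1"
    REV="$2"

    # Only poke Jenkins when the commit actually touched trunk.
    if svnlook dirs-changed -r "$REV" "$REPOS" | grep -q '^trunk/'; then
        curl --silent "http://jenkins.example.com/job/trunk-build/build?token=SECRET_TOKEN" >/dev/null
    fi

Polling the repository from Jenkins every few minutes works just as well if you'd rather not touch the server-side hooks.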

Doing it this way completely avoids question #2: You always build after code is delivered to the trunk because that's when a commit is done.

The last one, how code is placed on the server, is more of a mystery. I like having Jenkins create a delivery artifact. You can then talk about what is released in terms of the Jenkins build number.

"Let's release build #25 on the production server."

"Wait, QA never tested build #25, but they tested build #24."

"Okay, let's release build #24, I see it has issue US232 fixed in it anyway."

By the way, we use curl or wget to pull the software off of Jenkins and onto our server. In fact, we have a deploy.sh script we use. You pull the deploy.sh script from Jenkins and run that. This automatically pulls down the right build from Jenkins and installs it. It shuts down the server, backs up the old install, installs the new software, restarts the server, and then reports back the results.
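
The deploy.sh itself isn't shown here, but a rough sketch of that kind of script might look like the following, assuming the job archives the site as a tarball; the Jenkins URL, job name, artifact path, and Apache commands are all placeholders:

    #!/bin/sh
    # deploy.sh <jenkins-build-number> -- pull a specific build and install it.
    set -e
    BUILD="${1:?usage: deploy.sh <jenkins-build-number>}"
    JENKINS="http://jenkins.example.com/job/trunk-build"
    ARTIFACT="site.tar.gz"
    DOCROOT="/var/www/live"

    # 1. Pull the requested build's artifact off Jenkins.
    curl --fail --silent -o "/tmp/$ARTIFACT" "$JENKINS/$BUILD/artifact/build/$ARTIFACT"

    # 2. Stop the server and back up the current install.
    apachectl stop
    tar czf "/var/backups/site-$(date +%Y%m%d%H%M%S).tar.gz" -C "$DOCROOT" .

    # 3. Unpack the new build and restart.
    rm -rf "${DOCROOT:?}"/*
    tar xzf "/tmp/$ARTIFACT" -C "$DOCROOT"
    apachectl start
    echo "Deployed build #$BUILD"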

However, there's something appealing about the idea of Subversion doing your deliveries for you. Various methods are used. One is that Subversion does an update automatically at a particular time off of a production branch. You put code on the production branch, and at 2:00am every morning, Subversion will deliver your code. It's a neat idea, and I've done it -- especially when you're talking about PHP, which doesn't have to be compiled.
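
That scheduled-update approach usually amounts to nothing more than a cron job on the web server; the branch URL and paths below are placeholders:

    # One-time setup: check the production branch out onto the web server.
    svn checkout http://svn.example.com/repo/branches/production /var/www/live

    # crontab entry (crontab -e as the deploy user): update every night at 2:00am.
    #   0 2 * * *  svn update --non-interactive /var/www/live >> /var/log/svn-deploy.log 2>&1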

I prefer the first method since you have more control and it forces you to deploy only Jenkins builds. Plus, there is usually some issue that will cause the svn update method of deployment to fail (a collision, an unversioned file that needed to be removed, etc.). And of course, this will only happen at the most critical time, and it will be a spectacular failure that happens right before your boss fills out your annual review.

So, in response to your third question: my preferred method is not to involve Subversion. Instead, I pull the built artifacts (using wget or curl) directly off the Jenkins server and run a true deployment script that handles everything that's required.

By the way, we are looking at various deployment tools like LiveRebel that will integrate with Jenkins. The idea is that Jenkins will build the deliverable package and deploy it to LiveRebel, and then IT can use LiveRebel to deploy it to our servers. We aren't sure whether each Jenkins build will be deployed to LiveRebel, or whether we will use the build promotion plugin to let QA select the builds to deploy to LiveRebel. The second option will prevent us from deploying builds that QA hasn't certified.

ADDENDUM

Thanks for the reply and insight. I was looking at per-task branching for a number of reasons:

I'll respond to each of your reasons...

1) Allows tasks to be done in isolation from the main trunk.

And tasks can be done in isolation from each other too. The problem is that the resulting code isn't isolated from these tasks and these tasks may be incompatible with each other.

A lot of managers like to think this isolation will let them pick and choose which tasks to include in the next release, and to make that choice at release time. One of the first true CM packages, Sablime from AT&T, was based upon this very philosophy.

In Sablime, you have a Generic, which is what they call a release. This is the base for all changes. Each change is assigned a Modification Request (MR number), and all work must be done against an MR.

You create the next generic by taking the old baseline generic, adding in the selected MRs, and tada! A new Generic can be created. Sounds simple: Old baseline + selected changes = new baseline.

Unfortunately, MRs would affect the same files. In fact, it wasn't uncommon for the new generic to contain a version of a file that was not written by an actual developer. Also, one MR would end up depending upon another. A manager would declare that MRs 1001, 1003, and 1005 were in the next release. We would try to add these MRs into the baseline, and find out that MR 1003 depended upon MR 1002, which in turn depended upon MR 1008, which is one we don't want to release. We would spend the next week trying to work out a set of releasable MRs and end up releasing software that was never thoroughly tested.

In order to solve this issue, we ended up with smaller changes between baselines. We would do a new Generic each week, sometimes two of them. This allowed us to make sure merging worked, and that an MR's dependencies were included before the MRs that depended upon them. However, it also eliminated the entire pick-and-choose concept. All that was left was a lot of overhead built into Sablime.

2) No time constraints - each task can take its time being completed and doesn't impact other tasks.

Tasks will always impact each other unless they are for two completely different software packages that run on two entirely different machines with two separate databases.

All tasks have a time constraint because there is a cost associated with time, and a benefit associated with that task. A task which takes a long time, but provides little benefit is not worth doing.

One of the jobs of development is to prioritize these tasks: Which should be done first, and which should be done later. Tasks that take too long should be broken up into subtasks.

In Agile development, no task is supposed to take up more resources than the sprint can afford. (A sprint is a mini-release and usually covers a two-week period.) In that two-week period, a developer has a certain number of points they can fulfill. (Points are loosely related to hours, but not exactly; just assume that one point represents X hours of work for this thought experiment.) If a developer can do 15 points per week, a task that takes more than 30 points is too big for the sprint and must be divided up into subtasks.

3) Releases and deployments can also be done on a per-task basis, rather than waiting for other tasks to be completed and then doing a multi-task release at a fixed point in time (which is what we are trying to get away from).

None of what I'm saying means you can't do task-based development. Sites that use Git do this a lot. Agile processes assume it. None of this implies that you go back to the waterfall method where no one is allowed to touch a keyboard until every single excruciating detail has been laid out. However, you can't simply do fifty separate tasks and then, the day before a release, pick and choose which ones you want to include. You don't release tasks; you release a software product.

Use task branching. However, as far as you are concerned, a task is not complete until its changes are in trunk. Developers are responsible for this. They must rebase (merge the changes from trunk into their branch), test their rebased code, and then deliver (merge their changes back to trunk) before the project can consider that task complete.
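
In Subversion terms, that rebase-and-deliver cycle might look roughly like the following; the branch name and working-copy paths are placeholders, and on SVN 1.8+ a plain merge replaces the --reintegrate form:

    # Rebase: pull the latest trunk changes into the task branch and retest there.
    cd ~/work/task-123               # working copy of ^/branches/task-123
    svn merge ^/trunk .
    svn commit -m "task-123: merge latest trunk into branch"

    # Deliver: merge the finished branch back to trunk. This is the commit
    # that the official Jenkins builds.
    cd ~/work/trunk                  # working copy of ^/trunk
    svn merge --reintegrate ^/branches/task-123 .
    svn commit -m "task-123: deliver to trunk"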

This is why I tell you that you can have two Jenkins instances: One for the official build on trunk and the release branches, and one for developers to do their task building. Developers can have all the fun in the world with their Jenkins and their branches and their development. It just doesn't count until it's on trunk and you build it with your Jenkins.

Think of Git and how Linux works. There is one official Git repository for Linux. You pull from this repository, and that creates your own Git repository on your machine. You can share this Git repository with your friends. You can do whatever your heart desires. You can create another Git repo from your Git repo, pull another copy of the official Git repo, or find someone else with a Git repo of Linux and pull from there.

However, all of the changes you do will not be considered part of Linux until those changes are pushed back to the one and only official Git repository.

answered by David W.