
Bamboo Limit Concurrent Builds Across Branches

We have a small number of shared databases for integration tests and a large number of branches that share these. Is there any way to prevent Bamboo from concurrently trying to run multiple branches that use the same database?

When builds in multiple branches run in parallel they clobber each other and fail.

shonky linux user asked Apr 11 '16 00:04



1 Answer

There are outstanding feature requests for this, BAM-12071 and BAM-2423, which are waiting on Atlassian to implement a solution.

In the meantime we devised a quick and dirty workaround based on old-fashioned file (actually directory) locking. Each resource is named by a variable, gatekeeper.resource, set in the job or branch configuration. At the beginning of a build, a "Gatekeeper" stage checks that the required resource is free by looking for a directory of that name in a common file area on a common server. While the directory exists the resource is in use. The first task of the subsequent build stage creates an empty directory named after the resource, and a final task removes it. Other builds cannot proceed past the first stage until the resource is free, which stops concurrent builds.

The downside is that it ties up a local Bamboo agent while waiting, and it is not completely foolproof, but it does work for us 99% of the time. It even works across build plans if the resource variable is defined correctly.

It's defined as an SSH task against a Linux instance:

# This Gatekeeper stage prevents concurrent builds against a resource 
# by looking for a directory instance in a common file area.
# If the directory exists the build cannot proceed until it disappears.
# The build sleeps as long as the directory exists.
#
# The first task in the subsequent stage is to create the directory, and 
# a final task in the build removes it.
# As a failsafe a background half-hourly cron job should remove lock 
# dirs if they exceed 3 x the build time.
#########################################################
# Wait for a random number of seconds (20-119) to reduce, but not eliminate, the chance that
# competing branch builds triggered by timers both see the dir gone, start the unit test job
# at once, and then proceed to clobber each other (i.e. a race condition).
# Note: Bamboo expects output every 3 minutes, so do not increase beyond 180 seconds.
SLEEPYTIME=$(( ( RANDOM % 100 ) + 20 ))
echo "SLEEPYTIME today is $SLEEPYTIME"
sleep "$SLEEPYTIME"
# Wait for the Gatekeeper lock dir to disappear. A lock left behind by a hung build is
# removed by the failsafe cron job once it is older than 3 hours.
file=/test/atlassian/bamboo-gatekeeper/inuse-${bamboo.gatekeeper.resource}
while [ -d "$file" ]
do
  echo "$(date +%H:%M:%S) waiting $SLEEPYTIME seconds..."
  sleep "$SLEEPYTIME"
done
exit 0

First job task of the build stage (after the Gatekeeper):

# This will fail if the lock file (actually a directory!) already exists
file=/test/atlassian/bamboo-gatekeeper/inuse-${bamboo.gatekeeper.resource}
mkdir "$file"
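
The random sleep narrows the race window, but two builds can still pass the Gatekeeper together before either creates the directory. Because mkdir without -p fails when the directory already exists, the wait and the claim can be folded into a single step. The following is a minimal sketch of that idea, not the setup described above: LOCK_ROOT, POLL_SECONDS and acquire_lock are hypothetical names, the temp-directory default exists only so the sketch runs anywhere, and it assumes the lock area sits on a filesystem where directory creation is atomic.

```shell
#!/usr/bin/env bash
# Sketch only: LOCK_ROOT, POLL_SECONDS and acquire_lock are illustrative names.
# In real use LOCK_ROOT would be the shared gatekeeper directory; the default
# here is a throwaway temp directory so the sketch can run anywhere.
LOCK_ROOT="${LOCK_ROOT:-$(mktemp -d)}"
POLL_SECONDS="${POLL_SECONDS:-30}"   # keep well under Bamboo's 3-minute output window

acquire_lock() {
  local lock="$LOCK_ROOT/inuse-$1"
  # Without this check a missing lock area would make mkdir fail forever.
  [ -d "$LOCK_ROOT" ] || { echo "lock area $LOCK_ROOT not found" >&2; return 1; }
  # mkdir (without -p) fails if the directory exists, so the check and the
  # claim happen in one step and only one build can win.
  until mkdir "$lock" 2>/dev/null; do
    echo "$(date +%H:%M:%S) $1 is in use, retrying in $POLL_SECONDS seconds..."
    sleep "$POLL_SECONDS"
  done
  echo "acquired $lock"
}

acquire_lock "demo_schema"
```

Like the original Gatekeeper, this still occupies an agent while it waits.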

Final task of the build stage, run after the build whether it succeeded or not:

file=/test/atlassian/bamboo-gatekeeper/inuse-${bamboo.gatekeeper.resource}
rm -rf "$file"
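
Where the locked work happens inside a single script rather than across separate Bamboo tasks, a shell trap can play the role of the final task. This is a sketch under that assumption only; with_lock and LOCK_ROOT are hypothetical names, and the temp-directory default is there just to make it runnable. A trap does not fire if the process is killed outright, so the cron failsafe is still needed.

```shell
#!/usr/bin/env bash
# Sketch only: with_lock and LOCK_ROOT are illustrative names.
LOCK_ROOT="${LOCK_ROOT:-$(mktemp -d)}"

# Usage: with_lock RESOURCE COMMAND [ARGS...]
with_lock() {
  local lock="$LOCK_ROOT/inuse-$1"
  shift
  (
    # Claim the resource; give up with a distinct status if it is taken.
    mkdir "$lock" || exit 99
    # Release the lock when this subshell exits, whether the command
    # succeeded or failed.
    trap 'rmdir "$lock"' EXIT
    "$@"
  )
}

with_lock "demo_schema" echo "running integration tests"
```

The exit status of with_lock is that of the wrapped command, so a failing test run still fails the build.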

There is also a failsafe cron cleanup task that removes any gatekeeper lock directories older than a few hours (3 in our case). It should not be necessary, but it prevents builds being tied up indefinitely if Bamboo itself is restarted without running a final task.

# This works in conjunction with the Bamboo unit tests. It clears any unit test lock directories after 3 hours (e.g. a build has hung or been killed without removing its lock)
15,45 * * * * find /test/atlassian/bamboo-gatekeeper -name 'inuse*' -mmin +180 -delete
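
If the cleanup grows beyond a one-liner, it can be moved into a small script that cron calls. The sketch below is one possible shape, assuming a find that supports -mindepth, -maxdepth and -delete (GNU and BSD find do); clean_stale_locks, LOCK_ROOT and MAX_AGE_MINUTES are hypothetical names, and the temp-directory default is only there so it can be run safely.

```shell
#!/usr/bin/env bash
# Sketch only: clean_stale_locks, LOCK_ROOT and MAX_AGE_MINUTES are illustrative names.
LOCK_ROOT="${LOCK_ROOT:-$(mktemp -d)}"
MAX_AGE_MINUTES="${MAX_AGE_MINUTES:-180}"

clean_stale_locks() {
  # Consider only lock directories directly under LOCK_ROOT, print each
  # stale one, then remove it. The pattern is quoted so the shell passes it
  # to find unchanged instead of expanding it against the current directory.
  find "$LOCK_ROOT" -mindepth 1 -maxdepth 1 -type d -name 'inuse*' \
    -mmin +"$MAX_AGE_MINUTES" -print -delete
}

clean_stale_locks
```

Printing the removed paths leaves a trace in the cron mail or log of which builds left locks behind.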

gatekeeper.resource can be defined as the name of anything you want. In our case it is a database schema used by integration tests. Some of our branches use a common test environment; other branches have their own instance. This solution stops the branches using the common environment from executing concurrently while allowing branches with their own environment to proceed.
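
Since the resource name ends up inside a path that is later passed to rm -rf, it may be worth checking it before use. The following is a hedged sketch of such a check, not something the setup above does; lock_path_for and LOCK_ROOT are hypothetical names, and the allowed character set is an assumption that should be adjusted to your own naming.

```shell
#!/usr/bin/env bash
# Sketch only: lock_path_for and LOCK_ROOT are illustrative names.
LOCK_ROOT="${LOCK_ROOT:-$(mktemp -d)}"

# Print the lock directory path for a resource name, or fail if the name is
# empty or contains anything other than letters, digits, '_', '.' or '-'.
lock_path_for() {
  local name="$1"
  case "$name" in
    ''|*[!A-Za-z0-9_.-]*)
      echo "invalid resource name: '$name'" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$LOCK_ROOT/inuse-$name"
}

lock_path_for "demo_schema"
```

Rejecting slashes in particular keeps a mistyped variable from pointing the lock path outside the gatekeeper directory.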

It is not a complete fix to limit concurrent builds to a specific number, but it is sufficient to get us around this issue until Atlassian implement a permanent solution. I hope it helps others.

shonky linux user answered Sep 22 '22 06:09
