
Jenkins Pipeline and semaphores

I'm building a Jenkins job that will run all my staging tests continuously, but not all at once (they rely on shared hardware). So I'm creating parallel jobs, with a semaphore to ensure that only a limited number run at once. Here's a simplified version of my pipeline that reproduces the issue:

import java.util.concurrent.Semaphore

def run(job) {
  return {
    this.limiter.acquire();
    try {
      println "running ${job}"
      build job
      println "finished ${job}"
    } finally {
      this.limiter.release();
    }
  }
}

def getJobs() {
  def allJobs = Jenkins.getInstance().getJobNames()
  def stagingJobs = []
  for(String job : allJobs) {
    if (job.startsWith("staging/temp")) {
      stagingJobs.add(job)
    }
  }
  println "${stagingJobs.size()} jobs were found."
  return stagingJobs
}

this.limiter = new Semaphore(2)
def jobs = [:]
for (job in getJobs()) {
  jobs[job] = run(job)
}
parallel jobs

When I run without the semaphore, everything works fine. But with the code above, the only output I get is:

[Pipeline] echo
6 jobs were found.
[Pipeline] parallel
[Pipeline] [staging/temp1] { (Branch: staging/temp1)
[Pipeline] [staging/temp2] { (Branch: staging/temp2)
[Pipeline] [staging/temp3] { (Branch: staging/temp3)
[Pipeline] [staging/temp4] { (Branch: staging/temp4)
[Pipeline] [staging/temp5] { (Branch: staging/temp5)
[Pipeline] [staging/temp6] { (Branch: staging/temp6)

If I view the pipeline steps, I can see the first two jobs start and their log messages appear. However, it seems the runner is never notified that the staging jobs have finished. As a result, the semaphore is never released and the other 4 jobs never get to start. Here's a thread dump taken mid-test, after the downstream builds have definitely finished:

Thread #7
    at DSL.build(unsure what happened to downstream build)
    at WorkflowScript.run(WorkflowScript:9)
    at DSL.parallel(Native Method)
    at WorkflowScript.run(WorkflowScript:38)
Thread #8
    at DSL.build(unsure what happened to downstream build)
    at WorkflowScript.run(WorkflowScript:9)
Thread #11
    at WorkflowScript.run(WorkflowScript:6)
Thread #12
    at WorkflowScript.run(WorkflowScript:6)

Eventually it times out with several java.lang.InterruptedException errors.

Is it possible to use semaphores in a pipeline, or is there a better way to ensure only a portion of jobs run at once? I would rather avoid spinning up nodes for what amounts to a simple test runner.

Malcolm Crum asked May 30 '17 07:05


2 Answers

The Concurrent Step plugin was just released and should work nicely for this use case.

With this, you can simplify your code:

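// No `def` here: a script-level `def` variable is local to the script body
// and would not be visible inside run() below.
semaphore = createSemaphore permit: 2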

def run(job) {
  return {
    acquireSemaphore (semaphore) {
      println "running ${job}"
      build job
      println "finished ${job}"
    }
  }
}

...
Patrice M. answered Sep 28 '22 02:09


Possible workaround with lock step

The Lockable Resources plugin has no semaphore capabilities.

It took me a long time to figure out how to coax the lock step into semaphore-like behavior; it would be nice if it could do that on its own. Here's an example:

int concurrency = 3
List colors = ['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']
Map tasks = [failFast: false]
for(int i=0; i<colors.size(); i++) {
    String color = colors[i]
    int lock_id = i % concurrency
    tasks["Code ${color}"] = { ->
        stage("Code ${color}") {
            lock("color-lock-${lock_id}") {
                echo "This color is ${color}"
                sleep 30
            }
        }
    }

}
// execute the tasks in parallel with concurrency limits
stage("Rainbow") {
    parallel(tasks)
}

The above will create custom locks:

  • color-lock-0
  • color-lock-1
  • color-lock-2

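Each task is assigned one of the three locks by its index (i % concurrency), and tasks that share a lock run one at a time, so at most three tasks run concurrently. It's not perfectly efficient (certainly not as efficient as a real semaphore), but it does a good enough job.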
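Applied to the staging jobs from the question, the same trick might look like the sketch below. It assumes the getJobs() helper from the question is defined in the same script and that the Lockable Resources plugin is installed; the "staging-lock-N" names are made up for this example.

```groovy
// Sketch only: getJobs() is the helper from the question;
// the lock names ("staging-lock-N") are invented for illustration.
int concurrency = 2
List stagingJobs = getJobs()
Map branches = [failFast: false]
for (int i = 0; i < stagingJobs.size(); i++) {
    String job = stagingJobs[i]      // fresh variable per iteration, captured by the closure
    int lockId = i % concurrency     // static assignment: job i always uses the same lock
    branches[job] = {
        lock("staging-lock-${lockId}") {
            echo "running ${job}"
            build job: job
            echo "finished ${job}"
        }
    }
}
// At most `concurrency` downstream builds run at once, one per lock.
parallel branches
```

Unlike a java.util.concurrent.Semaphore, the lock step waits without blocking the thread that runs the pipeline script, which is why this variant should not hang the way the original attempt did.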

Hopefully that helps others.

Limitations

Your pipeline will take as long as your slowest lock. So if several long-running jobs happen to be assigned to the same lock (e.g. color-lock-1), your pipeline could take longer than it would with a proper semaphore.

For example:

  • color-lock-0 takes 20 seconds to cycle through all jobs.
  • color-lock-1 takes 30 minutes to cycle through all jobs.
  • color-lock-2 takes 2 minutes to cycle through all jobs.

Then your job will take 30 minutes to run, whereas with a true semaphore it would have been much faster, because the longer-running jobs would take the next available permit rather than wait behind each other on the same lock.
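To put rough numbers on that, here is a small plain-Groovy sketch (no Jenkins steps, so it runs in any Groovy console). The individual task durations are invented so that the per-lock totals match the example above, and the "semaphore" is an idealized model in which tasks acquire a slot in list order; a real semaphore does not guarantee that order.

```groovy
// Hypothetical task durations in seconds, chosen so the per-lock totals match
// the example above: lock 0 = 20 s, lock 1 = 30 min, lock 2 = 2 min.
List<Integer> durations = [5, 900, 60, 5, 900, 60, 10]
int concurrency = 3

// Lock workaround: task i always waits on lock i % concurrency.
List<Integer> lockTotals = [0] * concurrency
durations.eachWithIndex { int d, int i -> lockTotals[i % concurrency] += d }

// Idealized semaphore: each task, in list order, takes whichever slot frees up first.
List<Integer> slotTotals = [0] * concurrency
durations.each { int d ->
    int next = slotTotals.indexOf(slotTotals.min())
    slotTotals[next] += d
}

println "lock workaround: ${lockTotals.max()} s"   // 1800
println "semaphore:       ${slotTotals.max()} s"   // 910
```

With these made-up numbers the two 15-minute tasks land on the same lock and run back to back, while the semaphore model lets them overlap, roughly halving the total time.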

It's better than nothing, and it's what I have so far. This sounds like a good time to open a feature request with the Lockable Resources plugin.

Sam Gleske answered Sep 28 '22 01:09