Have Jenkins Fail Fast When Node Is Offline

Tags: jenkins, groovy

I have a MultiJob Project (made with the Jenkins Multijob plugin), with a series of MultiJob Phases. Let's say one of these jobs is called SubJob01. The jobs that are built are each configured with the "Restrict where this project can be run" option to be tied to one node. SubJob01 is tied to Slave01.

I would like it if these jobs would fail fast when the node is offline, instead of saying "(pending—slave01 is offline)". Specifically, I want there to be a record of the build attempt in SubJob01, with the build being marked as failed. This way, I can configure my MultiJob project to handle the situation as I'd like, instead of using the Jenkins build timeout plugin to abort the whole thing.

Does anyone know of a way to fail-fast a build if all nodes are offline? I could intersperse the MultiJob project with system Groovy scripts to check whether the desired nodes are offline, but that seems like it'd be reinventing, in the wrong place, what should already be a feature.

pgn674 asked Sep 23 '13 16:09


1 Answer

I ended up with this solution, which has worked well. The first build step of SubJob01 is an "Execute system Groovy script" step, and this is the script:

import jenkins.model.Jenkins

// Name of the node this job is tied to.
final String requiredNode = "Slave01"

println "Looking for offline slaves:"
for (slave in Jenkins.instance.nodes) {
    // toComputer() can return null (e.g. a node with no executors); treat that as offline.
    def computer = slave.toComputer()
    if (computer == null || computer.isOffline()) {
        println "  * Slave ${slave.nodeName} is offline!"
        if (slave.nodeName == requiredNode) {
            println "    !!!! This is ${requiredNode} !!!!"
        }
    }
}

// Check the required node directly; a missing node also counts as offline.
def requiredComputer = Jenkins.instance.getNode(requiredNode)?.toComputer()
boolean requiredNodeOffline = (requiredComputer == null) || requiredComputer.isOffline()

println "\n\n"
println "${requiredNode} is offline: ${requiredNodeOffline}"
println "\n\n"

if (requiredNodeOffline) {
    println "The ${requiredNode} slave is offline - we can not possibly continue...."
    println "Please contact IT to resolve the slave down issue before retrying the build."
    // This script relies on a non-zero return value to mark the build step as failed.
    return 1
}

println "\n\n"
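
If a job depends on more than one node, the same check can be driven by a list of names instead of a hard-coded comparison. The following is only a sketch, assuming it runs as an "Execute system Groovy script" step with access to the Jenkins object model. The requiredNodes list and the Slave02 entry are hypothetical and not part of my setup. It also throws an exception rather than returning 1, on the assumption that an uncaught exception fails the build step.

```groovy
import hudson.AbortException
import jenkins.model.Jenkins

// Hypothetical list of nodes this build depends on; adjust to your setup.
def requiredNodes = ['Slave01', 'Slave02']

def instance = Jenkins.instance

// A node counts as unavailable if Jenkins does not know it, if it has no
// Computer object, or if its Computer reports itself as offline.
def unavailable = requiredNodes.findAll { name ->
    def computer = instance.getNode(name)?.toComputer()
    computer == null || computer.isOffline()
}

if (unavailable) {
    // Assumption: an uncaught exception marks this build step as failed.
    throw new AbortException("Required node(s) offline: ${unavailable.join(', ')}")
}

println "All required nodes are online: ${requiredNodes.join(', ')}"
```

Keeping the names in one list means adding a node is a one-line change, and the failure message names every offline node at once.
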
pgn674 answered Oct 07 '22 08:10