I have a heavily parallelized build across 45 slaves (one master that just handles launches).
The problem I am running into is that about 3% of the jobs disappear.
The project setup is a "master" job that launches (via the parameterized job plugin) N jobs across N slaves. Most of the time, the console output for the master job shows the correct job numbers for the distributed build steps.
Occasionally, however, the job indicated in the console actually belongs to a completely different build.
Where do I even start looking to track this down? The Jenkins logs are eerily empty of any information about failed jobs or problems launching jobs.
My best guess at the moment is that the missing jobs were actually queued waiting for executors when something happened to remove them. But I have no evidence to support this.
Thoughts, suggestions, and helpful links are all greatly appreciated.
Here's how you can get more info: go to http://[jenkins_server]/log/ -> Add new log recorder -> enter a name of your choice -> OK -> Add -> enter hudson.model.Run as Logger -> set Log Level to ALL -> Save.
Now http://[jenkins_server]/log/[your log name]/ will give you more detail about how your jobs are being run.
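To find which console entries point at the wrong build, you could also cross-check each downstream build against its recorded upstream cause. What follows is only a rough sketch, not a tested recipe for your setup. It assumes the builds were triggered in a way that records an upstream cause, so that the JSON from /job/<name>/<number>/api/json carries "upstreamProject" and "upstreamBuild" under actions -> causes. It also assumes the server allows anonymous reads and that job names need no URL escaping. The function names and the job names in the usage note are made up for illustration.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.request import urlopen


def upstream_causes(build: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Collect (upstreamProject, upstreamBuild) pairs from a build's JSON."""
    found: List[Tuple[str, int]] = []
    for action in build.get("actions", []):
        # Some actions are serialized as empty objects or null; skip those.
        if not action:
            continue
        for cause in action.get("causes", []):
            if "upstreamProject" in cause and "upstreamBuild" in cause:
                found.append((cause["upstreamProject"], int(cause["upstreamBuild"])))
    return found


def belongs_to(build: Dict[str, Any], master_job: str, master_build: int) -> bool:
    """True if the build records the given master job/build as an upstream cause."""
    return (master_job, master_build) in upstream_causes(build)


def fetch_build(base_url: str, job: str, number: int) -> Dict[str, Any]:
    """Fetch one build's JSON (assumes anonymous read access, simple job names)."""
    url = f"{base_url}/job/{job}/{number}/api/json"
    with urlopen(url) as resp:
        return json.load(resp)
```

For example, if the console of build 120 of a master job named "master-job" says it started "worker-job" #3456, then belongs_to(fetch_build("http://[jenkins_server]", "worker-job", 3456), "master-job", 120) should be True. A False result marks a console entry that points at someone else's build.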