Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Jenkins suddenly started failing to provision agents in Amazon EKS

We are using the Kubernetes plugin to provision agents in EKS and around 8:45 pm EST yesterday, with no apparent changes on our end (I'm the only admin, and I certainly wasn't doing anything then) we started getting issues with provisioning agents. I have rebooted the EKS node and the Jenkins master. I can confirm that kubectl works fine and lists 1 node running.

I'm suspecting something must have changed on the AWS side of things.

What's odd is that those ALPN errors don't show up anywhere else in our logs until just before this started happening. Google around, I see people saying to ignore these "info" messages because the Java version doesn't support ALPN, but the fact that it's complaining about "HTTP/2" makes me wonder if Amazon changed something on their end to be HTTP/2 only?

I know this might seem too specific for a SO question, but if something did change with AWS that broke compatibility, I think this would be the right place.

From the Jenkins log at around 8:45:

INFO: Docker Container Watchdog check has been completed
Aug 29, 2019 8:42:05 PM hudson.model.AsyncPeriodicWork$1 run
INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 0 ms
Aug 29, 2019 8:45:04 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Excess workload after pending Kubernetes agents: 1
Aug 29, 2019 8:45:04 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Template for label eks: Kubernetes Pod Template
Aug 29, 2019 8:45:04 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Aug 29, 2019 8:45:04 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning Kubernetes Pod Template from eks with 1 executors. Remaining excess workload: 0
Aug 29, 2019 8:45:14 PM hudson.slaves.NodeProvisioner$2 run
INFO: Kubernetes Pod Template provisioning successfully completed. We have now 3 computer(s)
Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Created Pod: jenkins-eks-39hfp in namespace jenkins
Aug 29, 2019 8:45:14 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Aug 29, 2019 8:45:14 PM io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1 onFailure
WARNING: Exec Failure: HTTP 403, Status: 403 -
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
        at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
        at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
        at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
WARNING: Error in provisioning; agent=KubernetesSlave name: jenkins-eks-39hfp, template=PodTemplate{inheritFrom='', name='jenkins-eks', namespace='jenkins', slaveConnectTimeout=300, label='eks', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], volumes=[HostPathVolume [mountPath=/var/run/docker.sock, hostPath=/var/run/docker.sock], EmptyDirVolume [mountPath=/tmp/build, memory=false]], containers=[ContainerTemplate{name='jnlp', image='infra-docker.artifactory.mycompany.io/jnlp-docker:latest', alwaysPullImage=true, workingDir='/home/jenkins/work', command='', args='-url http://jenkins.mycompany.io:8080 ${computer.jnlpmac} ${computer.name}', ttyEnabled=true, resourceRequestCpu='', resourceRequestMemory='', resourceLimitCpu='', resourceLimitMemory='', envVars=[KeyValueEnvVar [getValue()=/home/jenkins, getKey()=HOME]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@2043f440}], envVars=[KeyValueEnvVar [getValue()=/tmp/build, getKey()=BUILDDIR]], imagePullSecrets=[org.csanchez.jenkins.plugins.kubernetes.PodImagePullSecret@40ba07e2]}
io.fabric8.kubernetes.client.KubernetesClientException:
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:198)
        at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
        at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
        at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
INFO: Terminating Kubernetes instance for agent jenkins-eks-39hfp
Aug 29, 2019 8:45:14 PM hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
SEVERE: A thread (OkHttp Dispatcher/255634) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2c315338 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@2bddc643[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
        at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)
        at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:678)
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.scheduleReconnect(WatchConnectionManager.java:300)
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$800(WatchConnectionManager.java:48)
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:213)
        at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
        at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
        at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
like image 829
Isaac Freeman Avatar asked Aug 30 '19 21:08

Isaac Freeman


People also ask

How does Jenkins work with Kubernetes?

A Kubernetes cluster adds a new automation layer to Jenkins. Kubernetes makes sure that resources are used effectively and that your servers and underlying infrastructure are not overloaded. Kubernetes' ability to orchestrate container deployment ensures that Jenkins always has the right amount of resources available.

What are EKS clusters?

EKS clusters are composed of the following main components—a control plane and worker nodes. Each cluster runs in its own, fully managed Virtual Private Cloud (VPC). The control plane is composed of three master nodes, each running in a different AZ to ensure AWS high availability.


1 Answers

Ran into this today as AWS just pushed the update for the net/http golang CVE for K8s versions 1.12.x. That patch apparently broke the version of the Kubernetes plugin we were on. Updating to the latest version of the plugin 1.18.3 resolved the issue.

https://issues.jenkins-ci.org/browse/JENKINS-59000?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel

like image 134
DiamonW Avatar answered Sep 21 '22 20:09

DiamonW