
Spark job in Kubernetes stuck in RUNNING state

I'm submitting Spark jobs to a Kubernetes cluster running locally (Docker Desktop). I can submit the jobs and see their final output on the screen.

However, even after the jobs have completed, the driver and executor pods remain in the Running state.

The base images used to submit the Spark jobs to Kubernetes are the ones that ship with Spark, as described in the docs.

This is what my spark-submit command looks like:

~/spark-2.4.3-bin-hadoop2.7/bin/spark-submit \
    --master k8s://https://kubernetes.docker.internal:6443 \
    --deploy-mode cluster \
    --name my-spark-job \
    --conf spark.kubernetes.container.image=my-spark-job \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.submission.waitAppCompletion=false \
    local:///opt/spark/work-dir/my-spark-job.py

And this is what kubectl get pods returns:

NAME                                READY   STATUS    RESTARTS   AGE
my-spark-job-1568669908677-driver   1/1     Running   0          11m
my-spark-job-1568669908677-exec-1   1/1     Running   0          10m
my-spark-job-1568669908677-exec-2   1/1     Running   0          10m
Victor asked Oct 22 '25 09:10


1 Answer

Figured it out. I forgot to stop the Spark context. My script now looks like this; on completion, the driver pod goes into Completed status and the executor pods get deleted.

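from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

# code

# Stop the context so the driver process can exit and the pod can complete
sc.stop()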
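One caveat with a bare sc.stop() on the last line: if the job code raises before reaching it, the context is never stopped. A minimal sketch of one way to guard against that is below. The stopping helper, FakeContext, and run_job are hypothetical names introduced only for illustration; FakeContext stands in for SparkContext so the sketch runs without a cluster.

```python
from contextlib import contextmanager


@contextmanager
def stopping(context):
    """Yield `context` and always call its stop() method afterwards."""
    try:
        yield context
    finally:
        # Runs on normal exit and when the job body raises.
        context.stop()


class FakeContext:
    """Hypothetical stand-in for SparkContext, used only in this sketch."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def run_job(context):
    """Hypothetical job body that fails part-way through."""
    with stopping(context) as sc:
        raise RuntimeError("job failed")
```

In a real script the same pattern would presumably wrap the actual context, for example `with stopping(SparkContext()) as sc:`, with the job code inside the block.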
Victor answered Oct 23 '25 23:10


