Spark Structured Streaming Blue/Green Deployments

Tags:

We'd like to be able to deploy our Spark jobs such that there isn't any downtime in processing data during deployments (currently there's about a 2-3 minute window). In my mind, the easiest way to do this is to simulate the "blue/green deployment" philosophy, which is to spin up the new version of the Spark job, let it warm up, then shut down the old job. However, with structured streaming & checkpointing, we cannot do this because the new Spark job sees that the latest checkpoint file already exists (from the old job). I've attached a sample error below. Does anyone have any thoughts on a potential workaround?

I thought about copying over the existing checkpoint directory to another checkpoint directory for the newly created job - while that should work as a workaround (some data might get reprocessed, but our DB should deduplicate), this seems super hacky and something I'd rather not pursue.

Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /user/checkpoint/job/offsets/3472939 already exists
    at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.validateOverwrite(FSDirRenameOp.java:520)
    at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.unprotectedRenameTo(FSDirRenameOp.java:364)
    at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameTo(FSDirRenameOp.java:282)
    at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameToInt(FSDirRenameOp.java:247)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3677)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename2(NameNodeRpcServer.java:914)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename2(ClientNamenodeProtocolServerSideTranslatorPB.java:587)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
    at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:1991)
    at org.apache.hadoop.fs.Hdfs.renameInternal(Hdfs.java:335)
    at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:678)
    at org.apache.hadoop.fs.FileContext.rename(FileContext.java:958)
    at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$FileContextManager.rename(HDFSMetadataLog.scala:356)
    at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.org$apache$spark$sql$execution$streaming$HDFSMetadataLog$$writeBatch(HDFSMetadataLog.scala:160)
    ... 20 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException): rename destination /user/checkpoint/job/offsets/3472939 already exists

296

asked Apr 04 '18 20:04

techalicious

1 Answers

It is possible, but it will add some complexity to your application. Starting streams is in general fast, so it is fair to assume, that delay is caused by initialization of static objects and dependencies. In that case you'll need only SparkContext / SparkSession, and no streaming dependencies so process can be described as:

Start new Spark application.
Initialize batch-oriented objects.
Pass message to the previous application to step down.
Wait for confirmation.
Start streams.

At the very high level, the happy path could be visualized as:

enter image description here

Since it is very generic pattern it could be implemented in a different ways, depending on a language and infrastructure:

Lightweight messaging queue like ØMQ.
Passing messages through distributed file system.
Placing applications in an interactive context (Apache Toree, Apache Livy) and using external client for orchestration.

190

answered Oct 19 '22 22:10

Alper t. Turker

Related questions
                            
                                spark importing data from oracle - java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver
                            
                                Does Spark Supports With Clause?
                            
                                Spark persist temp view
                            
                                Spark job failing due to space issue
                            
                                How to deal with array<String> in spark dataframe?
                            
                                Low cpu usage while running a spark job
                            
                                How to use a predicate while reading from JDBC connection?
                            
                                using DataSet.repartition in Spark 2 - several tasks handle more than one partition
                            
                                Does CrossValidator in PySpark distribute the execution?
                            
                                Spark, Scala - How to get Top 3 value from each group of two column in dataframe [duplicate]
                            
                                PATH issue: Could not find valid SPARK_HOME while searching
                            
                                How to (equally) partition array-data in spark dataframe
                            
                                Spark UDF not running in parallel
                            
                                Spark window function on dataframe with large number of columns
                            
                                Passing multiple system properties to google dataproc cluster job
                            
                                What is the difference between a "stateful" and "stateless" system?
                            
                                Spark Structured Streaming app has no jobs and no stages
                            
                                Xml processing in Spark
                            
                                How to pass variables in spark SQL, using python?
                            
                                Difference when serializing a lazy val with or without @transient

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Spark Structured Streaming Blue/Green Deployments

Tags:

deployment

apache-spark

hadoop

blue-green-deployment

spark-structured-streaming

techalicious

People also ask

1 Answers

Alper t. Turker

Recent Activity

Donate For Us