Apache Flink: My application does not resume from a checkpoint when I restart it

I have a Flink job that reads files from a folder and writes them to a database. New files arrive in that folder daily.

I have enabled checkpointing so that if the Flink job stops for any reason and I have to restart it, it does not re-read the files that have already been read.

I added the lines below to my code, but when I restart the job, it reads all the files again.

env.setStateBackend(new FsStateBackend("file:///C://Users//folder"));
env.enableCheckpointing(10L);
asked Jan 23 '19 by Ankit


People also ask

How do I enable checkpoints in Flink?

Enabling and Configuring Checkpointing. By default, checkpointing is disabled. To enable checkpointing, call enableCheckpointing(n) on the StreamExecutionEnvironment, where n is the checkpoint interval in milliseconds.
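
For reference, a minimal runnable sketch of enabling checkpointing (the 60-second interval is illustrative; the 10 ms interval in the question is far too frequent for a real job):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EnableCheckpointingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Take a checkpoint every 60 seconds (illustrative value).
        env.enableCheckpointing(60_000L);
        // ... add sources, transformations, and sinks here, then call env.execute(...)
    }
}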

How do Flink checkpoints work?

Checkpoints make state in Flink fault tolerant by allowing state and the corresponding stream positions to be recovered, thereby giving the application the same semantics as a failure-free execution. See Checkpointing for how to enable and configure checkpoints for your program.

What is checkpoint and savepoint in Flink?

Conceptually, Flink's savepoints are different from checkpoints in a way that's analogous to how backups are different from recovery logs in traditional database systems. The primary purpose of checkpoints is to provide a recovery mechanism in case of unexpected job failures.

How does Flink guarantee exactly once processing?

Apache Flink guarantees exactly-once processing upon failure and recovery by resuming the job from a checkpoint, the checkpoint being a consistent snapshot of the distributed data stream and operator state (per the Chandy-Lamport algorithm for distributed snapshots). This guarantees exactly-once semantics upon failover.
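
For illustration, the checkpointing mode can be set explicitly when enabling checkpointing (EXACTLY_ONCE is already the default; the interval is illustrative):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Request exactly-once checkpoint semantics explicitly
        // (this is the default mode).
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);
    }
}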


2 Answers

Checkpoints are a mechanism to recover from failures during the execution of an application, not to resume an application that was explicitly canceled.

If you have a running application and the execution fails (for whatever reason), Flink will try to recover the application by restarting it and initializing the operator state from the last checkpoint. If the recovery fails (for example, because not enough processing slots are available), the job is considered failed.

If you manually cancel an application and restart it, Flink will not use a checkpoint to initialize the operator state. In fact, Flink will (by default) delete all checkpoints when you cancel an application.

The concept you are looking for is the savepoint. Savepoints are very similar to checkpoints, but they are triggered manually by the user and are not automatically deleted when the application is explicitly canceled. When starting an application, you can start it from a savepoint, which means that the operator state is initialized from the savepoint.
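
As a side note, if you want checkpoints themselves to survive an explicit cancel, Flink also supports externalized (retained) checkpoints. A minimal sketch, with an illustrative interval:

import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RetainedCheckpointsExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000L);
        // Keep completed checkpoints even when the job is canceled,
        // so they can be used to restore state on a manual restart.
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    }
}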

There are also different restart strategies available that configure how often and at what intervals Flink tries to restart a failed application.
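
For example, a fixed-delay restart strategy can be configured in code (the attempt count and delay below are illustrative):

import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartStrategyExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // On failure, retry the job up to 3 times, waiting 10 seconds
        // between attempts (both values are illustrative).
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
                3, Time.of(10, TimeUnit.SECONDS)));
    }
}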

answered Oct 17 '22 by Fabian Hueske


@fabian-hueske covered all aspects of what's going on with your "planned" restart.

You should plan to cancel the job with a savepoint:

flink cancel --withSavepoint ${SAVEPOINT_DIR} ${JOBID}

Then start the new job with the savepoint from the previous step:

flink run -s ${SAVE_POINT} -p ${PARALLELISM} -d ${JOB_JAR} ${JOB_ARGS}
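
For example, with hypothetical values for the savepoint directory, job ID, jar, and parallelism:

# Hypothetical values throughout; substitute your own savepoint dir, job ID, and jar.
flink cancel --withSavepoint /tmp/flink-savepoints a5af00b0f167b5b67d5a5e8a
flink run -s /tmp/flink-savepoints/savepoint-a5af00-0123456789ab -p 4 -d my-flink-job.jar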
answered Oct 17 '22 by Bon Speedy