
How to restart an AWS Data Pipeline

I have a scheduled AWS Data Pipeline that failed partway through its execution. I fixed the problem without modifying the Pipeline in any way (changed a script in S3). However, there seems to be no good way to restart the Pipeline from the beginning.

I tried Deactivating/Reactivating the Pipeline, but the previously "FINISHED" nodes were not restarted. This is expected; according to the docs, this only pauses and un-pauses execution of the Pipeline, which is not what we want.

I tried Rerunning one of the nodes (call it x) individually, but it did not respect dependencies: none of the nodes x depends on reran, nor did the nodes that depend on x.

I tried activating it from a time in the past, but received the error: startTimestamp should be later than any Schedule StartDateTime in the pipeline (Service: DataPipeline; Status Code: 400; Error Code: InvalidRequestException; Request ID: <SANITIZED>).

I would rather not change the Schedule node, since I want the Pipeline to continue to respect it; I only need this one manual execution. How can I restart the Pipeline from the beginning, once?

Simon Lepkin asked Dec 29 '25 09:12



1 Answer

So far, the best approach I've found is to Clone the Pipeline, make the clone On-Demand (instead of Scheduled), and activate it. The new Pipeline runs immediately on activation. This seems cumbersome, however; I'd be happy to hear of a better way.
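For anyone who wants to script this clone-and-activate workaround rather than click through the console, here is a minimal sketch using boto3's `datapipeline` client. It is untested against a real pipeline and rests on a few assumptions: that the pipeline-wide settings live in an object whose id is `Default`, that the schedule is an object of type `Schedule` referenced through `schedule` fields, and that setting `scheduleType` to `ondemand` on `Default` is enough to make the clone On-Demand. The helper names `to_on_demand` and `clone_as_on_demand` are made up for this example.

```python
import copy
import uuid


def to_on_demand(pipeline_objects):
    """Return a copy of the pipeline objects with scheduling stripped out.

    Assumes the API object format: {"id", "name", "fields": [{"key", ...}]}.
    """
    result = []
    for obj in copy.deepcopy(pipeline_objects):
        fields = obj["fields"]
        # Drop Schedule objects entirely (assumption: an on-demand clone does not need them).
        if any(f["key"] == "type" and f.get("stringValue") == "Schedule" for f in fields):
            continue
        # Remove references to the schedule and any existing scheduleType.
        fields = [f for f in fields if f["key"] not in ("schedule", "scheduleType")]
        # Assumption: the pipeline-wide defaults live in the object with id "Default".
        if obj["id"] == "Default":
            fields.append({"key": "scheduleType", "stringValue": "ondemand"})
        obj["fields"] = fields
        result.append(obj)
    return result


def clone_as_on_demand(source_pipeline_id, clone_name):
    """Copy a pipeline's definition into a new On-Demand pipeline and activate it."""
    import boto3  # imported here so to_on_demand can be used without boto3 installed

    client = boto3.client("datapipeline")
    definition = client.get_pipeline_definition(pipelineId=source_pipeline_id)
    clone_id = client.create_pipeline(
        name=clone_name, uniqueId=str(uuid.uuid4())
    )["pipelineId"]
    response = client.put_pipeline_definition(
        pipelineId=clone_id,
        pipelineObjects=to_on_demand(definition["pipelineObjects"]),
        parameterObjects=definition.get("parameterObjects", []),
        parameterValues=definition.get("parameterValues", []),
    )
    if response["errored"]:
        raise RuntimeError(response["validationErrors"])
    client.activate_pipeline(pipelineId=clone_id)
    return clone_id
```

Calling `clone_as_on_demand("<source pipeline id>", "my-pipeline-rerun")` would leave the original Scheduled Pipeline untouched, so it keeps respecting its Schedule. The clone still has to be deleted by hand afterwards, so this only automates the workaround; it does not make it less cumbersome.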

Simon Lepkin answered Dec 30 '25 23:12



