I have a scheduled AWS Data Pipeline that failed partway through its execution. I fixed the problem without modifying the Pipeline in any way (changed a script in S3). However, there seems to be no good way to restart the Pipeline from the beginning.
I tried Deactivating/Reactivating the Pipeline, but the previously "FINISHED" nodes were not restarted. This is expected; according to the docs, this only pauses and un-pauses execution of the Pipeline, which is not that we want.
I tried Rerunning one of the nodes (call it x) individually, but it did not respect dependencies: none of the nodes x depends on reran, nor did the nodes that depend on x.
I tried activating it from a time in the past, but received the error: startTimestamp should be later than any Schedule StartDateTime in the pipeline (Service: DataPipeline; Status Code: 400; Error Code: InvalidRequestException; Request ID: <SANITIZED>).
I would rather not change the Schedule node, since I want the Pipeline to continue to respect it; I only need this one manual execution. How can I restart the Pipeline from the beginning, once?
So far, the best way to accomplish this that I've found is to Clone the Pipeline, make it On-Demand (instead of Scheduled) and activate that one. This new Pipeline will activate and run immediately. This seems cumbersome, however; I'd be happy to hear a better way.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With