How to restart an AWS Data Pipeline

Question

I have a scheduled AWS Data Pipeline that failed partway through its execution. I fixed the problem without modifying the Pipeline itself (I changed a script in S3). However, there seems to be no good way to restart the Pipeline from the beginning.

I tried Deactivating/Reactivating the Pipeline, but the previously "FINISHED" nodes were not restarted. This is expected; according to the docs, this only pauses and un-pauses execution of the Pipeline, which is not what I want.

I tried Rerunning one of the nodes (call it x) individually, but it did not respect dependencies: none of the nodes x depends on reran, nor did the nodes that depend on x.

I tried activating it from a time in the past, but received the error: startTimestamp should be later than any Schedule StartDateTime in the pipeline (Service: DataPipeline; Status Code: 400; Error Code: InvalidRequestException; Request ID: <SANITIZED>).

I would rather not change the Schedule node, since I want the Pipeline to continue to respect it; I only need this one manual execution. How can I restart the Pipeline from the beginning, once?

Simon Lepkin · Accepted Answer

The best approach I've found so far is to Clone the Pipeline, change the clone to On-Demand (instead of Scheduled), and activate the clone. The new Pipeline runs immediately on activation. This seems cumbersome, however; I'd be happy to hear of a better way.
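If you need to do this more than once, the clone-and-activate steps could probably be scripted. The following is only a rough sketch, not something I have run against a real Pipeline. It assumes a boto3 `datapipeline` client, and the helper names `to_on_demand` and `clone_as_on_demand` are made up for illustration. It also assumes that the Pipeline has a `Default` object and that dropping the Schedule object, removing every reference to it, and setting `scheduleType` to `ondemand` on `Default` is enough to make the clone On-Demand; check any validation errors returned for your own definition.

```python
import copy
import uuid


def to_on_demand(pipeline_objects):
    """Return a copy of the objects without Schedule objects, with Default set to on-demand.

    Hypothetical helper: assumes the API object format
    {"id": ..., "name": ..., "fields": [{"key": ..., "stringValue" | "refValue": ...}]}.
    """
    # Collect the ids of all objects whose "type" field is "Schedule".
    schedule_ids = {
        obj["id"]
        for obj in pipeline_objects
        if any(f["key"] == "type" and f.get("stringValue") == "Schedule"
               for f in obj["fields"])
    }
    result = []
    for obj in pipeline_objects:
        if obj["id"] in schedule_ids:
            continue  # drop the Schedule object itself
        obj = copy.deepcopy(obj)  # leave the caller's definition untouched
        # Drop any existing scheduleType and any reference to a removed Schedule.
        obj["fields"] = [
            f for f in obj["fields"]
            if f["key"] != "scheduleType" and f.get("refValue") not in schedule_ids
        ]
        if obj["id"] == "Default":
            obj["fields"].append({"key": "scheduleType", "stringValue": "ondemand"})
        result.append(obj)
    return result


def clone_as_on_demand(client, source_pipeline_id, new_name):
    """Copy a pipeline definition into a new on-demand pipeline and activate it.

    `client` is assumed to be boto3.client("datapipeline").
    """
    definition = client.get_pipeline_definition(pipelineId=source_pipeline_id)
    new_id = client.create_pipeline(
        name=new_name, uniqueId=str(uuid.uuid4())
    )["pipelineId"]
    response = client.put_pipeline_definition(
        pipelineId=new_id,
        pipelineObjects=to_on_demand(definition["pipelineObjects"]),
        parameterObjects=definition.get("parameterObjects", []),
        parameterValues=definition.get("parameterValues", []),
    )
    if response["errored"]:
        raise RuntimeError(response.get("validationErrors"))
    client.activate_pipeline(pipelineId=new_id)
    return new_id

# Example call (the pipeline id and name are placeholders):
#   import boto3
#   clone_as_on_demand(boto3.client("datapipeline"), "df-EXAMPLE", "one-off rerun")
```

The original scheduled Pipeline is left untouched, so it keeps respecting its Schedule; the clone can be deleted once the one-off run has finished.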

Tags: amazon-web-services, amazon-data-pipeline
