EMR activity stuck in Waiting_For_Runner state

I am creating a data pipeline to export a DynamoDB table to an S3 bucket, using the standard template for this in the Data Pipeline console. I have verified that the runsOn field is set to the name of the EMR cluster to be started. However, the EMR activity status is still Waiting_For_Runner. Any ideas why?

Thanks!!!

user3610975 asked May 08 '14 07:05


1 Answer

Waiting_For_Runner means Data Pipeline is still waiting for a Task Runner on the EMR cluster to connect and pick up the activity.

A few things you can check:

  1. IAM permissions (roles) between EMR and Data Pipeline. Here's a link!
  2. Check whether Task Runner is running on the master instance:
    $ ps -ef | grep workerGroup    (run on the master instance)
  3. Check the --workerGroup name Task Runner was started with on EMR (visible in the Task Runner process command line) and compare it with the workerGroup name in the pipeline definition.
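
Checks 2 and 3 can be combined into one small script. The following is only a rough sketch, assuming Task Runner was started with a `--workerGroup` option on its command line; the function names and the `wg-123` worker group are made up for illustration and are not part of any AWS tooling.

```shell
#!/bin/sh
# Hypothetical helpers (not part of AWS tooling).

# Read process-listing text on stdin and print the value passed to
# --workerGroup. Accepts "--workerGroup=name" and "--workerGroup name".
extract_worker_group() {
    sed -n 's/.*--workerGroup[= ]\([^ ]*\).*/\1/p' | head -n 1
}

# Read process-listing text on stdin and compare the worker group found
# there with the name expected by the pipeline definition ($1).
# Returns 0 on match, 1 on mismatch, 2 if no --workerGroup was found.
compare_worker_group() {
    expected="$1"
    actual=$(extract_worker_group)
    if [ -z "$actual" ]; then
        echo "no Task Runner process with --workerGroup found"
        return 2
    elif [ "$actual" = "$expected" ]; then
        echo "match: $actual"
        return 0
    else
        echo "mismatch: Task Runner uses '$actual', pipeline expects '$expected'"
        return 1
    fi
}
```

On the master instance it could be used like this, where `wg-123` stands in for the worker group name from your pipeline definition (the `[w]` keeps grep from matching its own process):

    $ ps -ef | grep '[w]orkerGroup' | compare_worker_group wg-123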
andy_l answered Sep 21 '22 12:09