AWS Data Pipeline stuck on Waiting For Runner

My goal is to copy a table from a PostgreSQL database running on AWS RDS to a .csv file on Amazon S3. For this I use AWS Data Pipeline, following a tutorial. However, after following all the steps, my pipeline is stuck at "WAITING FOR RUNNER". The AWS documentation states:

ensure that you set a valid value for either the runsOn or workerGroup fields for those tasks

However, the "Runs On" field (runsOn in the definition) is set. Any idea why this pipeline is stuck?


My pipeline definition file:

{
  "objects": [
    {
      "output": {
        "ref": "DataNodeId_Z8iDO"
      },
      "input": {
        "ref": "DataNodeId_hEUzs"
      },
      "name": "DefaultCopyActivity01",
      "runsOn": {
        "ref": "ResourceId_oR8hY"
      },
      "id": "CopyActivityId_8zaDw",
      "type": "CopyActivity"
    },
    {
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "name": "DefaultResource1",
      "id": "ResourceId_oR8hY",
      "type": "Ec2Resource",
      "terminateAfter": "1 Hour"
    },
    {
      "*password": "xxxxxxxxx",
      "name": "DefaultDatabase1",
      "id": "DatabaseId_BWxRr",
      "type": "RdsDatabase",
      "region": "eu-central-1",
      "rdsInstanceId": "aqueduct30v05.cgpnumwmfcqc.eu-central-1.rds.amazonaws.com",
      "username": "xxxx"
    },
    {
      "name": "DefaultDataFormat1",
      "id": "DataFormatId_wORsu",
      "type": "CSV"
    },
    {
      "database": {
        "ref": "DatabaseId_BWxRr"
      },
      "name": "DefaultDataNode2",
      "id": "DataNodeId_hEUzs",
      "type": "SqlDataNode",
      "table": "y2018m07d12_rh_ws_categorization_label_postgis_v01_v04",
      "selectQuery": "SELECT * FROM y2018m07d12_rh_ws_categorization_label_postgis_v01_v04 LIMIT 100"
    },
    {
      "failureAndRerunMode": "CASCADE",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "pipelineLogUri": "s3://rutgerhofste-data-pipeline/logs",
      "scheduleType": "ONDEMAND",
      "name": "Default",
      "id": "Default"
    },
    {
      "dataFormat": {
        "ref": "DataFormatId_wORsu"
      },
      "filePath": "s3://rutgerhofste-data-pipeline/test",
      "name": "DefaultDataNode1",
      "id": "DataNodeId_Z8iDO",
      "type": "S3DataNode"
    }
  ],
  "parameters": []
}
Rutger Hofste asked Nov 07 '22 04:11


1 Answer

A "WAITING FOR RUNNER" state usually means the activity is waiting for a resource (such as an EMR cluster) to pick it up. You do not seem to have set the 'workerGroup' field. In other words, you have specified what to do, but not who should do it.
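
As a rough sketch of what assigning a worker group could look like, the copy activity from the question might carry a 'workerGroup' string in place of 'runsOn'. The name "my-worker-group" is a made-up placeholder, not something from the question. It only has an effect if a Task Runner that you run yourself is polling for that same worker group name. Going by the documentation quoted in the question, either 'runsOn' or 'workerGroup' is enough, so this is an alternative to the Ec2Resource rather than something required on top of it.

```json
{
  "id": "CopyActivityId_8zaDw",
  "name": "DefaultCopyActivity01",
  "type": "CopyActivity",
  "input": {
    "ref": "DataNodeId_hEUzs"
  },
  "output": {
    "ref": "DataNodeId_Z8iDO"
  },
  "workerGroup": "my-worker-group"
}
```

With this variant the activity stays in "WAITING FOR RUNNER" until a Task Runner registered under the same worker group name picks it up.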

srb answered Dec 21 '22 23:12