Delimiter not found error - AWS Redshift Load from s3 using Kinesis Firehose

Question

I am using Kinesis Firehose to load data into Redshift via S3. I have a very simple CSV file, shown below. Firehose writes it to S3, but Redshift fails with a "Delimiter not found" error. I have read every post I could find about this error and confirmed that the delimiter is present in the file.

File

GOOG,2017-03-16T16:00:01Z,2017-03-17 06:23:56.986397,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:02.061263,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:07.143044,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:12.217930,848.78

OR

"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:48:59.993260","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:07.034945","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:12.306484","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:18.020833","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:24.203464","852.12"

Redshift Table

CREATE TABLE stockvalue
( symbol                   VARCHAR(4),
  streamdate               VARCHAR(20),
  writedate                VARCHAR(26),
  stockprice               VARCHAR(6)
);

Error: [screenshot: Error]

Just in case, here's what my Kinesis stream looks like: [screenshot: Firehose]

Can someone point out what may be wrong with the file? I added a comma between the fields. All columns in the destination table are VARCHAR, so there should be no reason for a datatype error. Also, the column lengths match exactly between the file and the Redshift table. I have tried the file both with and without double quotes around the fields.
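One way to double-check the claim that every row has four fields and fits the column widths is a small local script. This is only a sketch: the function name `find_bad_rows` is made up for illustration, the widths are copied from the CREATE TABLE above, and it assumes plain ASCII data so that character counts equal byte counts.

```python
import csv
import io

# Column names and VARCHAR widths copied from the CREATE TABLE statement.
COLUMN_WIDTHS = [
    ("symbol", 4),
    ("streamdate", 20),
    ("writedate", 26),
    ("stockprice", 6),
]


def find_bad_rows(text):
    """Return (line_number, message) pairs for rows that would not fit the table."""
    problems = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        # A wrong field count is what a missing or extra delimiter looks like.
        if len(row) != len(COLUMN_WIDTHS):
            problems.append(
                (line_no, f"expected {len(COLUMN_WIDTHS)} fields, got {len(row)}")
            )
            continue
        for (name, width), value in zip(COLUMN_WIDTHS, row):
            if len(value) > width:
                problems.append(
                    (line_no, f"{name} is {len(value)} chars, limit {width}")
                )
    return problems


sample = (
    "GOOG,2017-03-16T16:00:01Z,2017-03-17 06:23:56.986397,848.78\n"
    '"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:48:59.993260","852.12"\n'
)
print(find_bad_rows(sample))
```

An empty result means the rows parse as four comma-separated fields locally. It does not prove that the object Firehose wrote to S3 looks the same, so it is worth running the check against the delivered file rather than the source file.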

Jon Ekiz · Accepted Answer

Can you post the full COPY command? It's cut off in the screenshot.

My guess is that you are missing DELIMITER ',' in your COPY command. Try adding that to the COPY command.
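To make the suggestion concrete, here is a minimal Python sketch that assembles a COPY statement with an explicit DELIMITER ','. The function name, bucket path, and role ARN are placeholders rather than values from the question, and REMOVEQUOTES is added only for the double-quoted variant of the file.

```python
def build_copy_command(table, s3_path, iam_role_arn, delimiter=",", quoted=False):
    """Build a Redshift COPY statement for a delimited text file (illustrative only)."""
    options = [f"DELIMITER '{delimiter}'"]
    if quoted:
        # Strips the surrounding double quotes from each field on load.
        options.append("REMOVEQUOTES")
    return (
        f"COPY {table} FROM '{s3_path}' "
        f"CREDENTIALS 'aws_iam_role={iam_role_arn}' "
        + " ".join(options)
        + ";"
    )


# Placeholder values; substitute your own bucket, prefix, and role.
print(
    build_copy_command(
        "stockvalue",
        "s3://<bucket-name>/<prefix>",
        "arn:aws:iam::<aws-account-id>:role/<role-name>",
    )
)
```

When Firehose issues the COPY for you, the part that matters is the options string (DELIMITER ',' and, if needed, REMOVEQUOTES), which goes into the COPY options field of the delivery stream.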

flusharcade · Answer

I was stuck on this for hours; Shahid's answer helped me solve it.

Text Case for Column Names is Important

Redshift will always treat your table's columns as lower-case, so when mapping JSON keys to columns, make sure the JSON keys are lower-case.

Your JSON file will look like:

{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}

And the COPY statement will look like

COPY latency(id,name) FROM 's3://<bucket-name>/<manifest>' CREDENTIALS 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>' MANIFEST json 'auto';

Settings within Firehose must have the column names specified (again, in lower-case). Also, add the following to Firehose COPY options:

json 'auto' TRUNCATECOLUMNS blanksasnull emptyasnull
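If the producer builds records from data whose keys are not already lower-case, one option is to normalize them before serializing. The helper below is a hypothetical sketch (the name `to_redshift_json` is not from any library) and only lower-cases top-level keys.

```python
import json


def to_redshift_json(record):
    """Serialize a dict as JSON with lower-cased top-level keys."""
    lowered = {str(key).lower(): value for key, value in record.items()}
    return json.dumps(lowered)


print(to_redshift_json({"ID": "val1", "Name": "val2"}))
```

Note that json.dumps emits double-quoted keys and strings, which is what the json 'auto' option expects.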

How to call put_records from Python:

The snippet below shows how to call put_records on a Kinesis stream from Python. The 'objects' argument passed to 'put_to_stream' is a list of dictionaries:

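import json

import boto3

# Assumed setup: the original snippet does not show how these are defined.
kinesis_client = boto3.client('kinesis')
kinesis_stream_name = '<stream-name>'


def put_to_stream(objects):
    """Send a list of dictionaries to the Kinesis stream in one put_records call."""
    records = []

    for obj in objects:
        record = {
            # Each record's payload is the dictionary serialized as JSON.
            'Data': json.dumps(obj),
            'PartitionKey': 'swat_report'
        }
        records.append(record)

    print(records)

    put_response = kinesis_client.put_records(StreamName=kinesis_stream_name, Records=records)
    return put_response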
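One practical detail when calling put_records: a single request accepts at most 500 records, so a long list has to be split first. The helper below is a sketch of that batching step only; the name `chunk_records` is made up for illustration and it does not call AWS.

```python
def chunk_records(records, batch_size=500):
    """Yield consecutive slices of records, each at most batch_size long."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Example with dummy records: 1200 items become batches of 500, 500 and 200.
dummy = [{"Data": str(i), "PartitionKey": "swat_report"} for i in range(1200)]
print([len(batch) for batch in chunk_records(dummy)])
```

Each batch can then be passed as the Records argument of a separate put_records call.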
