I am using Kinesis firehose to transfer data to Redshift via S3. I have a very simple csv file that looks like this. The firehose puts it to s3 but Redshift errors out with Delimiter not found error. I have looked at literally all posts related to this error but I made sure that delimiter is included.
File
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:23:56.986397,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:02.061263,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:07.143044,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:12.217930,848.78
OR
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:48:59.993260","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:07.034945","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:12.306484","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:18.020833","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:24.203464","852.12"
Redshift Table
CREATE TABLE stockvalue
( symbol VARCHAR(4),
streamdate VARCHAR(20),
writedate VARCHAR(26),
stockprice VARCHAR(6)
);
Error Error
Just in case, here's what my kinesis stream looks like Firehose
Can someone point out what may be wrong with the file. I added a comma between the fields. All columns in destination table are varchar so there should be no reason for datatype error. Also, the column lengths match exactly between the file and redshift table. I have tried embedding columns in double quotes and without.
Can you post the full COPY command? It's cut off in the screenshot.
My guess is that you are missing DELIMITER ','
in your COPY command. Try adding that to the COPY command.
I was stuck on this for hours, and thanks to Shahid's answer it helped me solve it.
Redshift will always treat your table's columns as lower-case, so when mapping JSON keys to columns, make sure the JSON keys are lower-case, e.g.
{'id': 'val1', 'name': 'val2'}{'id': 'val1', 'name': 'val2'}{'id': 'val1', 'name': 'val2'}{'id': 'val1', 'name': 'val2'}
COPY latency(id,name) FROM 's3://<bucket-name>/<manifest>' CREDENTIALS 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>' MANIFEST json 'auto';
Settings within Firehose must have the column names specified (again, in lower-case). Also, add the following to Firehose COPY options:
json 'auto' TRUNCATECOLUMNS blanksasnull emptyasnull
Below is a snippet showing how to use the put_records functions with kinesis in python:
'objects' passed into the 'put_to_stream' function is an array of dictionaries:
def put_to_stream(objects):
records = []
for metric in metrics:
record = {
'Data': json.dumps(metric),
'PartitionKey': 'swat_report'
};
records.append(record)
print(records)
put_response = kinesis_client.put_records(StreamName=kinesis_stream_name, Records=records)
flush
``
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With