I try to have a Kinesis Firehose pushing data in a Redshift table.
The firehose stream is working and putting data in S3.
But nothing arrive in the destination table in Redshift.
How can I troubleshoot this ?
Kinesis Data Firehose delivers your data to your S3 bucket first and then issues an Amazon Redshift COPY command to load the data into your Amazon Redshift cluster. Specify an S3 bucket that you own where the streaming data should be delivered. Create a new S3 bucket, or choose an existing bucket that you own.
Redshift is one such destination supported by Kinesis and data can be streamed from Kinesis to Redshift. Amazon's Redshift is a fully managed cloud-based data warehouse service from the Amazon Web Services(AWS) family. It is designed to store petabytes of data in its data warehouse storage.
Q: What is a record in Kinesis Data Firehose? A record is the data of interest your data producer sends to a delivery stream. The maximum size of a record (before Base64-encoding) is 1024 KB.
In the end, I made it work by deleting and re-creating the Firehose stream :-/ Probably the repeated edits via the web console made the thing unstable.
But here are troubleshooting guidelines :
At this point, you should be able to see the connection attempts in Redshift logs :
select * from stl_connection_log where remotehost like '52%' order by recordtime desc;
Check that the Redshift user used by Firehose has enough privileges on the target table :
select tablename,
HAS_TABLE_PRIVILEGE(tablename, 'select') as select,
HAS_TABLE_PRIVILEGE(tablename, 'insert') as insert,
HAS_TABLE_PRIVILEGE(tablename, 'update') as update,
HAS_TABLE_PRIVILEGE(tablename, 'delete') as delete,
HAS_TABLE_PRIVILEGE(tablename, 'references') as references
from pg_tables where schemaname='public' order by tablename;
Then you can check if the COPY command is run :
select * from stl_query order by endtime desc limit 10;
Then check load errors, or server errors :
select * from stl_load_errors order by starttime desc;
select * from stl_error where userid!=0 order by recordtime desc;
If you have format problems in your data, or in the COPY options, or a mismatch between your data and the target columns, you should at least see the COPY attempts, and some load errors.
If you're still stuck, with nothing appearing in those log tables, try deleting and recreating the whole firehose stream, as there may be some bugs related to the web console. (This step worked for me)
Go to the IAM role(firehose_delivery_role
) auto-created during the Kinesis Firehose setup and make sure that the following roles are attached:
AmazonS3FullAccess
AmazonRedshiftFullAccess
AmazonKinesisFullAccess
AmazonKinesisFirehoseFullAccess
There is a bug that omits the S3 credentials in IAM, leaving the Kinesis setup unable to work.
Also verify that you in fact see the data files accumulating in S3.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With