Has anyone been successful using Amazon Redshift as a source or destination ODBC component in SQL Server Data Tools 2012?
I've installed the PostgreSQL drivers provided by Amazon and have successfully tested a connection in the Windows ODBC driver administrator, but I keep running into arcane error messages when I choose my saved DSN and try to pull a table listing.
Redshift is based on quite an old version of Postgres (8.0.2). Postgres has changed quite a bit since then, and the Postgres tools have changed with it. When downloading any tool to use with Redshift, you will probably need an older release from several years ago rather than the current one.
The table listing problem is particularly annoying, and I have yet to find a version of psql that can properly list Redshift tables. As an alternative, you can query the INFORMATION_SCHEMA tables to find this kind of info, and in my opinion this is what SSIS/SSDT should be doing by default.
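For illustration, here is a minimal Python sketch of the kind of INFORMATION_SCHEMA query I mean. The function name table_listing_query and the "public" schema are my own placeholders, and the %s parameter style assumes a DB-API driver such as psycopg2; nothing here is specific to SSIS/SSDT.

```python
def table_listing_query(schema="public"):
    """Return (sql, params) for listing the base tables in one schema.

    Hypothetical helper: it only builds the query, it does not connect.
    """
    sql = (
        "SELECT table_schema, table_name "
        "FROM information_schema.tables "
        "WHERE table_type = 'BASE TABLE' AND table_schema = %s "
        "ORDER BY table_name"
    )
    # The schema name is passed as a bound parameter, not pasted into the SQL.
    return sql, (schema,)


sql, params = table_listing_query("public")
print(sql)
print(params)
```

With a working connection you would pass both values to the driver's cursor.execute(sql, params) and read the rows back, which sidesteps the catalog queries that the newer client tools issue.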
I would not expect SSIS to be able to load data into Redshift reliably, i.e. to act as a Redshift destination. This is because Redshift does not really support INSERT INTO as a way to load data in bulk. If you use INSERT INTO, you will only be able to load ~10 rows per second. Redshift can only load data quickly from S3 or DynamoDB, using the COPY command.
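As a rough sketch of what the COPY route looks like, the Python below builds a COPY statement for files already staged in S3. The helper names, the table, the bucket and the role ARN are all made-up placeholders, and I'm assuming CSV files and IAM role authorization; adjust the options to match your own files.

```python
def quote_literal(value):
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_copy_statement(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement that loads CSV files from S3.

    Hypothetical helper: 'table' is inserted as-is, so it must come from
    trusted code, never from user input.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("this sketch only handles s3:// sources")
    return (
        f"COPY {table} "
        f"FROM {quote_literal(s3_uri)} "
        f"IAM_ROLE {quote_literal(iam_role_arn)} "
        "FORMAT AS CSV"
    )


# Placeholder names, not real resources.
stmt = build_copy_statement(
    "public.events",
    "s3://example-bucket/events/part-",
    "arn:aws:iam::123456789012:role/ExampleRedshiftLoad",
)
print(stmt)
```

The idea is that your ETL tool only has to write files to S3 and then run this one statement over any connection, instead of pushing rows through INSERT INTO.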
It's a similar story for all the other ETL tools I've tried, notably the open source tools Pentaho PDI (aka Kettle) and Talend Open Studio. This is particularly annoying in Talend's case, as they have Redshift components but those components actually try to use INSERT INTO for loading. Even Amazon's own ETL tool, Data Pipeline, does not yet support Redshift as a 'node'.