Currently we warehouse our Postgres DB using SSIS, but there are certain things we can't do. For example, with an ADO.NET provider it isn't possible to use parameters in the data source of a data flow component. We're trying out an OLE DB provider, PGNP, which looks like it does what we require.
I'd like to know what other options are available and your opinions of them. I've used Talend Open Studio, but the performance wasn't that good compared to SSIS.
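To make the requirement concrete, here is a minimal sketch of the kind of parameterized extract being described: the source query takes a watermark as a bound parameter rather than having it spliced into the SQL text. Everything in it is an assumption for illustration. The `orders` table, its columns, and the `extract_since` helper are hypothetical, and an in-memory SQLite database stands in for Postgres so the snippet runs on its own. Against Postgres with psycopg2 the placeholder would be `%s` instead of `?`.

```python
import sqlite3


def make_source():
    """Build a stand-in source database (SQLite in memory, not Postgres)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical source table; names are illustrative only.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (id, updated_at, amount) VALUES (?, ?, ?)",
        [
            (1, "2024-01-01T00:00:00", 10.0),
            (2, "2024-01-02T00:00:00", 20.0),
            (3, "2024-01-03T00:00:00", 30.0),
        ],
    )
    return conn


def extract_since(conn, watermark):
    """Return rows changed after the watermark, passed as a bound parameter."""
    cur = conn.execute(
        "SELECT id, updated_at, amount FROM orders "
        "WHERE updated_at > ? ORDER BY id",
        (watermark,),  # bound by the driver, never concatenated into the SQL
    )
    return cur.fetchall()


source = make_source()
# ISO-8601 timestamps stored as text compare correctly as strings.
rows = extract_since(source, "2024-01-01T12:00:00")
```

Whatever tool replaces SSIS would need an equivalent of that bound parameter in its source step, typically fed from a variable holding the last load time.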
Organizations often use Postgres as both an ETL data source and a data sink. Sometimes the source data needs to be pulled out and streamed to BI tools for analytics and other business reporting.
One common criticism of traditional ETL as a data integration approach is that it was designed before cloud-based storage existed, so some of its assumptions predate cloud storage solutions.
PostgreSQL is a widely used open-source relational database maintained by an active community, and it is commonly used as a data warehouse. It can handle more than one kind of data processing, which makes it a compelling option.
You can try Pentaho Data Integration (PDI, formerly Kettle).
Community Edition is free.
It has a GUI similar to SSIS and is easy to use after a short introduction.
It is a Java application and uses the native PostgreSQL JDBC driver, so performance should be at least comparable to SSIS.
PDI CE download: http://sourceforge.net/projects/pentaho/files/
We ended up using CloverETL. Does what it says on the tin:
CloverETL® is a data integration platform scaling from open source desktop to a commercial cloud cluster. It's a Java-based open platform that helps design, automate, and monitor data integration processes.