I am thinking of moving our SSIS ETLs to Azure Data Factory. My arguments in favour of such a leap are:
Our sources and targets are already in the cloud. ADF is cloud native, so it seems a good fit.
ADF is a service, and therefore we could consume and pay for it on demand. SSIS implies licensing costs and doesn't lend itself naturally to on-demand consumption (we thought of using DevOps to spin up ETL servers on an ad-hoc basis).
Generating ETL code programmatically with SSIS requires very specific skills such as BIML or the DTS API. By moving to ADF I am hoping the combination of JSON and the T-SQL and C# in U-SQL will make the necessary skills more generic.
I am hoping members of the community can share their experiences and thus help me come to a decision.
PRO TIP: No, Azure Data Factory does not replace SSIS. SSIS is a different product with different capabilities. For example, SSIS can struggle with very large data sets, and its packages can be difficult to manage and update. Azure Data Factory seems to address these issues.
You can now move your SQL Server Integration Services (SSIS) projects, packages, and workloads to the Azure cloud. Deploy, run, and manage SSIS projects and packages in the SSIS Catalog (SSISDB) on Azure SQL Database or SQL Managed Instance with familiar tools such as SQL Server Management Studio (SSMS).
With Azure Data Factory, it is fast and easy to build code-free or code-centric ETL and ELT processes. In this scenario, learn how to create code-free pipelines within an intuitive visual environment.
The answers to this old post are quite outdated. My comments below are related to ADF version 2.
First of all, ADF has the capability to run SSIS packages, so moving your legacy ETL processes there and moving to ADF incrementally is not only possible but recommended. You don't want to change everything with every new piece of technology that comes out. You can then only implement new or modified ETL processes on ADF activities.
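As a sketch of what this looks like, an SSIS package can be invoked from an ADF v2 pipeline with the Execute SSIS Package activity, which runs against an Azure-SSIS integration runtime. The package path and runtime name below are hypothetical placeholders:

```json
{
  "name": "RunLegacyPackage",
  "type": "ExecuteSSISPackage",
  "typeProperties": {
    "packageLocation": {
      "packagePath": "MyFolder/MyProject/MyPackage.dtsx"
    },
    "connectVia": {
      "referenceName": "MyAzureSsisIR",
      "type": "IntegrationRuntimeReference"
    }
  }
}
```

The Azure-SSIS integration runtime hosts the SSISDB catalog on Azure SQL Database or SQL Managed Instance, so existing packages deploy there largely unchanged.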
Secondly, although maybe not completely there yet, with ADF Data Flows you can do most of the transformations you could do with SSIS. There are still some missing bits and pieces, but most of the commonly used functionality is there.
ADF authoring does not require Visual Studio. It does need specific skills but I found the learning curve not to be steep. Documentation and best practices are still a bit lacking in certain areas, but someone already experienced in database / data warehouse architecture and ETL will find it relatively easy. The best thing about it is that most things can be done visually without messing with the code (which is just simple JSON).
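To illustrate how simple that underlying JSON is, here is a minimal pipeline with a single Copy activity. The dataset names and source/sink types are hypothetical and would depend on your own linked services:

```json
{
  "name": "CopyBlobToSql",
  "properties": {
    "activities": [
      {
        "name": "CopyFromBlob",
        "type": "Copy",
        "inputs": [
          { "referenceName": "SourceBlobDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "SinkSqlDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink": { "type": "AzureSqlSink" }
        }
      }
    ]
  }
}
```

In practice you would build this in the visual editor and only drop down to the JSON for source control diffs or bulk edits.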
Furthermore, ADF integrates with Azure DevOps and uses Git for versioning, so you get change management for free.
For more advanced needs you can also run Databricks activities with Java (Scala) or Python, and integrate with Hadoop (Hive and Pig) and Spark.
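For example, calling out to a Databricks notebook from a pipeline is itself just another activity. This is a minimal sketch; the linked service name, notebook path, and parameter are hypothetical:

```json
{
  "name": "TransformWithDatabricks",
  "type": "DatabricksNotebook",
  "linkedServiceName": {
    "referenceName": "MyDatabricksLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "notebookPath": "/etl/transform_orders",
    "baseParameters": {
      "run_date": "@pipeline().parameters.runDate"
    }
  }
}
```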
Finally, ADF incorporates monitoring and diagnostic tools which in SSIS you had to build yourself. You can see much more easily which activity failed and what the error was.
If your ETLs are simple and easy to convert, replace them with Data Factory.
If they require complex logic, use SSIS.
In other words, if the transform logic can be implemented by configuration, Data Factory is the best choice.
If it requires writing code and programming skills, SSIS is the right tool.
A few links that may help other people (you have most likely made your decision already):
"Azure Data Factory and SSIS compared"
Think of ADF as a complementary service to SSIS, with its main use case confined to inexpensively dealing with big data in the cloud.
Download Azure_Data_Factory_vs_SSIS article from sqlbits