Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is Alteryx an ETL tool? How it differs from SSIS? [closed]

My client want me to implement ETL process using Alteryx as they have a license of it. I am confused whether the Alteryx is an ETL tool or not. I believe that Alteryx is commonly used to prepare data for Tableau data visualization tool.

Please advise whether its an ETL tool or not? How it differs from SSIS?

Thanks,

like image 257
Harry Avatar asked Dec 05 '22 21:12

Harry


2 Answers

Alteryx is a data preparation / advanced anaytics application. People use it in many different ways due to the fact it allows data preparation, spatial analytics and predictive.

I work with many clients who choose to use Alteryx purely for its ETL capabilities moving data from one database to another, e.g. I have worked with one client who has used Alteryx to automate their loads into their Amazon Redshift database from MySQL, another who is using SQL -> Tableau data engine, and many other examples involving a range of data inputs (Alteryx supports everything from custom APIs -> Excel).

If you're already working with SSIS then you'll find Alteryx a breathe of fresh air to be honest, I was working with SSIS in a past life and have since found Alteryx to be much faster to develop with. It is more forgiving to changes to data and allows tighter integration of many different data sources. The new in-database tools give a much tighter integration with SQL as was previously possible allowing the work to be done inside the database.

Finally, compared to SSIS, I think you'll find Alteryx very simple to learn. The online training videos on their site will give you as much introduction as you need.

Enjoy, I think you'll enjoy the experience.

Chris

like image 112
Chris Love Avatar answered Dec 30 '22 13:12

Chris Love


Alteryx can be used for ETL as long as you have an Alteryx Server. I've used it for a number of use-cases especially between cloud & database.

Some things that in my personal opinion make it clearly superior to SSIS:

  • If input has column names (from database or from csv file with headers), it handles unexpected new columns or column order changes automatically, without requiring you to change the flows at all.
  • You can build flows as "macros" which you can then unit test completely independently of your source/destination databases (try that in SSIS..)
  • Ability to drop a browse tool anywhere in the flow and effectively debug.
  • Build in assertions using "Test" tools.

Flows are runnable from the commandline on a server, and easiest way I've found (besides using Alteryx's own scheduler) is to save as an "App", and then run from the command line using the Alteryx engine executable, passing it parameters via xml file. You can save a sample xml parameter file from your flow by hitting the magic wand button (after saving the flow as a .yxwz (app)) This brings up a panel that lets you set the variables, and that panel has a handy "save" button which generates an xml file in the right format.

Within the flows themselves, parameterise things like environment settings either via action tools or module level parameters (User.*) - you can then for example set a database server on an input using %User.[Your variable name]% in the field.

Error logs are generally excellent (identify the tool that failed, useful error messages), and command line throws useful errorlevel numbers, so pretty trivial to schedule with some third party scheduler (or just use the Alteryx Server's own scheduler).

Obviously if you need to do any serious data manipulation, pivoting etc, then it's hands down the easiest tool I've used.

like image 23
Runonthespot Avatar answered Dec 30 '22 14:12

Runonthespot