I've been using Pentaho Kettle for quite a while, and previously the transformations and jobs I've made (using Spoon) have been quite simple: load from a database, rename fields, and output the data to another database. But now I've been building transformations that do somewhat more complex calculations, which I would now like to test somehow.
So what I would like to do is test these transformations.
One option would probably be to make a Kettle test job that would test the transformation. But as my transformations relate to a Java project, I would prefer to run the tests from JUnit. So I've considered writing a JUnit test that would run the transformation against a test database and verify its output.
This approach would, however, require test databases, which are not always available (Oracle and other expensive/legacy databases). What I would prefer is to be able to mock my input steps or pass some stub test data to them somehow.
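For reference, this is a minimal sketch of what I had in mind on the JUnit side, using the Kettle Java API (the transformation file name here is made up):

    import static org.junit.Assert.assertEquals;

    import org.junit.BeforeClass;
    import org.junit.Test;
    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class MyCalculationTransformationTest {

        @BeforeClass
        public static void initKettle() throws Exception {
            // Initialise the Kettle environment once per test run (loads plugins etc.).
            KettleEnvironment.init();
        }

        @Test
        public void transformationRunsWithoutErrors() throws Exception {
            // "my_calculation.ktr" is a placeholder for the transformation under test.
            TransMeta transMeta = new TransMeta("src/test/resources/my_calculation.ktr");
            Trans trans = new Trans(transMeta);

            trans.execute(null);       // no command-line arguments
            trans.waitUntilFinished(); // block until all steps have completed

            // A non-zero error count means at least one step failed.
            assertEquals(0, trans.getErrors());
            // ...plus assertions against the output tables, which is where
            // the test database dependency comes in.
        }
    }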
Any other ideas on how to test Pentaho Kettle transformations?
There is a JIRA issue somewhere on jira.pentaho.com (I don't have it to hand) that requests exactly this, but alas it is not yet implemented.
So you do have the right solution in mind. I'd also add Jenkins and an Ant script to tie it all together. I've done a similar thing with report testing: I had a Pentaho job load the data, then execute the report, then compare the output with known output and report pass/failure.
If you separate your Kettle jobs into two phases, one transformation that loads the data and one that processes it, you can test the processing phase on its own.
You can use a "Copy rows to result" step at the end of your load-data transformation, and a "Get rows from result" step at the start of your processing transformation.
If you do this, then you can use any means to load the data (a Kettle transformation, DbUnit called from an Ant script, etc.) and can mock up any database tables you want.
I use this for testing some ETL scripts I've written and it works just fine.
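As a rough sketch of how that can look from JUnit (the step layout, field names and file paths are hypothetical, and this assumes a reasonably recent PDI Java API): build the stub rows in the test, hand them to the processing transformation as its previous result, and read back whatever "Copy rows to result" produced:

    import static org.junit.Assert.assertEquals;

    import java.util.ArrayList;
    import java.util.List;

    import org.junit.BeforeClass;
    import org.junit.Test;
    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.core.Result;
    import org.pentaho.di.core.RowMetaAndData;
    import org.pentaho.di.core.row.RowMeta;
    import org.pentaho.di.core.row.value.ValueMetaString;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class ProcessTransformationTest {

        @BeforeClass
        public static void initKettle() throws Exception {
            KettleEnvironment.init();
        }

        @Test
        public void processesStubRows() throws Exception {
            // Stub input rows built in code -- no test database needed.
            RowMeta rowMeta = new RowMeta();
            rowMeta.addValueMeta(new ValueMetaString("customer_name")); // hypothetical field

            List<RowMetaAndData> inputRows = new ArrayList<RowMetaAndData>();
            inputRows.add(new RowMetaAndData(rowMeta, new Object[] { "Alice" }));
            inputRows.add(new RowMetaAndData(rowMeta, new Object[] { "Bob" }));

            Result previousResult = new Result();
            previousResult.setRows(inputRows);

            // "process.ktr" is a placeholder: it starts with "Get rows from result"
            // and ends with "Copy rows to result".
            TransMeta transMeta = new TransMeta("src/test/resources/process.ktr");
            Trans trans = new Trans(transMeta);

            // The "Get rows from result" step reads the previous result's rows,
            // just as it would when the transformation is called from a parent job.
            trans.setPreviousResult(previousResult);

            trans.execute(null);
            trans.waitUntilFinished();
            assertEquals(0, trans.getErrors());

            // Rows written by "Copy rows to result" end up on the transformation's result.
            List<RowMetaAndData> outputRows = trans.getResult().getRows();
            assertEquals(2, outputRows.size());
        }
    }

Because this is a plain JUnit test, it also slots straight into the Ant/Jenkins setup mentioned above.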