Is it possible to execute a Python wheel class/method (not a script) in Azure Data Factory using an Azure Databricks activity, the way you would execute a method packaged in a Java .jar? Unlike a script, a method would have the ability to return value(s) without doing something like burying them in stdout.
I haven't been able to find anything on this. I tried using the Jar activity with no luck, which didn't surprise me, but it was worth a try.
If not, what I am looking for is a way to use Azure Databricks compute and return a small set of values back from the Python job. I have successfully used the ADF activity for a Databricks Python script.
TIA!
Yes. Add the wheel as a library on the cluster. Then create a .py file that imports the library and calls the method you need, and save that .py file to DBFS.
Create a Data Factory pipeline that uses the Databricks Python activity and point it at your .py file. You can pass in arguments as well.
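As a minimal sketch of such a wrapper file: the package name `mywheel.jobs` and the method `run_job` are hypothetical stand-ins for whatever your wheel exposes (a stub is included here so the sketch is self-contained; your installed wheel provides the real import).

```python
# entry.py - wrapper script for the ADF Databricks Python activity.
# "mywheel.jobs" and "run_job" are hypothetical names; replace them
# with the package and method from your own wheel.
import json
import sys

try:
    from mywheel.jobs import run_job  # the method packaged in your wheel
except ImportError:
    # Stand-in so this sketch runs on its own; the wheel supplies the real one.
    def run_job(name: str) -> dict:
        return {"job": name, "status": "ok"}

def main(argv):
    # ADF passes activity parameters through as command-line arguments.
    name = argv[1] if len(argv) > 1 else "default"
    result = run_job(name)
    # A Python file activity has no direct return channel, so emit the
    # result as one JSON line on stdout (or write it to storage instead).
    print(json.dumps(result))
    return result

if __name__ == "__main__":
    main(sys.argv)
```

Note that the Python activity itself doesn't surface a return value to ADF; if you need that, the notebook route below is the cleaner option.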
You could also do this with a notebook that imports the library. A notebook has the advantage that it can hand a value back to ADF via `dbutils.notebook.exit`, which ADF exposes as the activity's `runOutput`.
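A sketch of that notebook, under the same hypothetical `mywheel.jobs`/`run_job` names as before. `dbutils` only exists inside Databricks, so the exit call is guarded here to keep the sketch self-contained; in a real notebook you call it directly.

```python
# Notebook sketch: call the wheel's method, then return a small JSON
# payload to ADF. "mywheel.jobs" and "run_job" are hypothetical names.
import json

try:
    from mywheel.jobs import run_job  # the method packaged in your wheel
except ImportError:
    # Stand-in so this sketch runs outside Databricks.
    def run_job(name: str) -> dict:
        return {"job": name, "status": "ok"}

result = run_job("demo")
payload = json.dumps(result)  # dbutils.notebook.exit takes a string

try:
    dbutils.notebook.exit(payload)  # ADF reads this as output.runOutput
except NameError:
    print(payload)  # running outside Databricks, where dbutils is undefined
```

A downstream ADF activity can then read the value with an expression like `@activity('NotebookActivity').output.runOutput` (activity name is whatever you called it). Keep the payload small; it's meant for a handful of values, not a dataset.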
This blog post (and the series it is part of) should help: https://datathirst.net/blog/2019/9/20/building-pyspark-applications-as-a-wheel