 

How do you execute a Python wheel class/method (not a script) in Azure Data Factory using an Azure Databricks activity?

Is it possible to execute a Python wheel class/method (not a script) in Azure Data Factory using an Azure Databricks activity, the way you would execute a packaged Java method in a .jar? Unlike a script, this would have the ability to return a value (or values) without resorting to something like burying them in stdout.

I haven't been able to find anything on this, and I tried the Jar activity with no luck, which didn't surprise me, but it was worth a try.

If not, what I am looking for is a way to use Azure Databricks compute and return a small set of values back from the Python job. I have successfully used the ADF activity for a Databricks Python script.

TIA!

asked Apr 20 '26 by TheOriginOf3

1 Answer

Yes. Add the wheel as a library on the cluster. Then create a .py file that imports the library and calls the method you need. Save the .py file onto the DBFS volume.
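A minimal sketch of such a wrapper script, assuming a hypothetical wheel package `mywheelpkg` exposing a `run_job` method (substitute your wheel's actual module and entry point):

```python
# run_wheel_method.py -- thin wrapper that ADF's Python activity can point at.
# `mywheelpkg.jobs.run_job` is a placeholder for your wheel's real entry point.
import json
import sys


def parse_args(argv):
    """Turn ['--key', 'value', ...] pairs into a kwargs dict for the method."""
    return {argv[i].lstrip("-"): argv[i + 1] for i in range(0, len(argv), 2)}


def main(argv):
    # Import inside main so the wheel only needs to be installed on the cluster.
    from mywheelpkg.jobs import run_job  # hypothetical wheel entry point

    params = parse_args(argv)
    result = run_job(**params)
    # A plain Python script can only hand values back via stdout/driver logs,
    # so serialise the result as JSON for downstream parsing.
    print(json.dumps(result))


if __name__ == "__main__":
    main(sys.argv[1:])
```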

Create a Data Factory pipeline that uses the Databricks Python activity and point it at your .py file. You can pass in arguments as well.
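The activity definition in the pipeline JSON looks roughly like this (linked service, file, and library paths are placeholders):

```json
{
  "name": "RunWheelWrapper",
  "type": "DatabricksSparkPython",
  "linkedServiceName": {
    "referenceName": "AzureDatabricksLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "pythonFile": "dbfs:/scripts/run_wheel_method.py",
    "parameters": ["--env", "prod"],
    "libraries": [
      { "whl": "dbfs:/libraries/mywheelpkg-0.1.0-py3-none-any.whl" }
    ]
  }
}
```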

You could also do this with a notebook that imports the library.
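The notebook route is also the cleanest way to get values back to ADF, since `dbutils.notebook.exit` returns a string that the pipeline can read as `@activity('...').output.runOutput`. A sketch of the notebook cell, again assuming the hypothetical `mywheelpkg.jobs.run_job` (`dbutils` only exists inside Databricks, so the exit call is shown commented for local reading):

```python
# Notebook-cell sketch: call the wheel method, then return its result to ADF.
import json


def build_exit_payload(result):
    """dbutils.notebook.exit accepts a single string, so serialise to JSON."""
    return json.dumps(result)


# In the actual Databricks notebook cell:
# from mywheelpkg.jobs import run_job   # hypothetical wheel entry point
# result = run_job(env="prod")
# dbutils.notebook.exit(build_exit_payload(result))
# ADF then reads the value as @activity('RunNotebook').output.runOutput
```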

This blog post (and the series it belongs to) should help: https://datathirst.net/blog/2019/9/20/building-pyspark-applications-as-a-wheel

answered Apr 21 '26 by simon_dmorias