I'm new to NiFi. I'm trying to execute a Python script using the ExecuteScript processor. A simple script with no import statements ran fine and showed its output in nifi.StdOut. But when I ran a script that includes an import statement such as import pandas, it failed with the error below:
Import Error: No module named Pandas
I tried pointing the Module Directory property at the path of the packages, but that did not work. Any help would be appreciated!
I believe the issue is that pandas is a natively compiled module (it relies on C extensions) rather than being pure Python. This is a problem because the Apache NiFi ExecuteScript processor runs scripts through a JSR-223 engine, which means it uses Jython rather than actual Python (CPython). Pure Python code runs fine, but it can't depend on modules that aren't pure Python.
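To confirm which interpreter a script is actually running under, a small check like the following may help. It is only a sketch: the helper names interpreter_name and is_jython are made up for illustration, and it relies on the standard library's platform.python_implementation(), which reports "Jython" under Jython and "CPython" under the regular interpreter.

```python
import platform


def interpreter_name():
    """Return the implementation name, e.g. 'CPython' or 'Jython'."""
    return platform.python_implementation()


def is_jython():
    """True when running under Jython, where C-extension modules such as pandas cannot be imported."""
    return interpreter_name() == "Jython"
```

Logging interpreter_name() from inside ExecuteScript and again from a command-line run of the same script should show the difference described above.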
The workaround is to use the ExecuteStreamCommand processor to invoke the Python script that depends on pandas from the command line (i.e. python my_script_that_uses_pandas.py). The flowfile content will be streamed to STDIN and the result captured from STDOUT. Here's a related answer describing this in detail.
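As a rough illustration of what my_script_that_uses_pandas.py could look like, here is a minimal sketch. It assumes the flowfile content is CSV with a header row; the transform function and the added total column are invented for the example and are not part of NiFi or of the original question.

```python
import io
import sys

import pandas as pd


def transform(csv_text):
    """Parse CSV text, append a per-row sum of the numeric columns, and return CSV text."""
    df = pd.read_csv(io.StringIO(csv_text))
    df["total"] = df.sum(axis=1, numeric_only=True)
    return df.to_csv(index=False)


if __name__ == "__main__":
    # ExecuteStreamCommand streams the flowfile content to STDIN.
    data = sys.stdin.read()
    if data.strip():
        # Whatever is written to STDOUT is captured by the processor.
        sys.stdout.write(transform(data))
```

In the ExecuteStreamCommand processor, this would typically mean setting Command Path to the python executable that has pandas installed and Command Arguments to the path of the script; check the property names against your NiFi version. The script can be tried outside NiFi first by piping a small CSV file into it.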