
Python error in Apache NiFi: Import Error: No module named Pandas

I'm new to NiFi. I'm trying to execute a Python script using the ExecuteScript processor. A simple script with no import statements ran fine and showed its output in nifi.StdOut. But when I run a script that includes an import such as import pandas, it shows the error below:

Import Error: No module named Pandas

I tried providing the path of the packages in the Module Directory property, but that didn't work. Any help would be appreciated!

Vicky asked Sep 06 '25 03:09


1 Answer

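I believe the issue is that pandas is a natively compiled module (it relies on C extensions) rather than being pure Python. This is a problem because the Apache NiFi ExecuteScript processor runs scripts through a JSR-223 script engine, and the engine it uses for Python is Jython rather than the standard CPython interpreter. Jython cannot load native C extension modules, so ordinary Python code runs fine, but it can't depend on modules that aren't pure Python. That is also why pointing the Module Directory property at the package doesn't help.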
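If you want to confirm which interpreter a given script is actually running under, a minimal sketch like the one below can help. It uses only the standard library; the function name describe_interpreter is my own and not part of NiFi. How you surface the result inside NiFi depends on your flow, so treat the print call as a placeholder.

```python
import platform


def describe_interpreter():
    """Return the implementation name and version of the running interpreter."""
    # platform.python_implementation() reports e.g. "CPython" or "Jython".
    return "%s %s" % (platform.python_implementation(), platform.python_version())


print(describe_interpreter())
```

If this reports Jython, modules with compiled C extensions such as pandas won't be importable from that script.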

The workaround is to use the ExecuteStreamCommand processor to invoke the Python script that depends on pandas from the command line (i.e. python my_script_that_uses_pandas.py). The flowfile content will be streamed to the script's STDIN, and whatever the script writes to STDOUT will be captured. Here's a related answer describing this in detail.
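To rehearse that STDIN/STDOUT contract outside NiFi before wiring up the processor, something like the following sketch may be useful. It is only an approximation: NiFi itself is not involved, the helper name run_like_execute_stream_command is made up for illustration, and CHILD_SCRIPT is a stand-in for my_script_that_uses_pandas.py that just upper-cases its input so the example needs nothing beyond the standard library.

```python
import subprocess
import sys

# Hypothetical stand-in for my_script_that_uses_pandas.py. A real script could
# import pandas here, because it runs under the system Python, not Jython.
CHILD_SCRIPT = """
import sys
for line in sys.stdin:
    sys.stdout.write(line.upper())
"""


def run_like_execute_stream_command(flowfile_content):
    """Pipe content to the child's STDIN and return what it wrote to STDOUT."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        input=flowfile_content,   # plays the role of the incoming flowfile
        capture_output=True,      # collect STDOUT (and STDERR) from the child
        text=True,
        check=True,               # raise if the child exits with an error
    )
    return completed.stdout


if __name__ == "__main__":
    print(run_like_execute_stream_command("id,name\n1,alice\n"))
```

The point of the design is that the script only needs to read from STDIN and write to STDOUT; it never has to know about NiFi, which makes it easy to test from a shell first.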

Andy answered Sep 08 '25 11:09