
How can I import Pandas with Jython?

I'm new to Python, and I've installed Jython 2.7.0.

Java

import org.python.util.PythonInterpreter;
import org.python.core.*; 

public class Main {
    public static void main(String[] args) {
        PythonInterpreter interp = new PythonInterpreter();
        interp.execfile("D:/Users/JY/Desktop/test/for_java_test.py");
        interp.close();
    }
}

Python

import pandas as pd
import ctypes

def main():
    data = pd.read_csv('for_test.csv')
    data_mean = data.a*2
    data_mean.to_csv('catch_test.csv',index=False)
    ctypes.windll.user32.MessageBoxW(0, "Done. Output: a * 2", "Output csv", 0)

if __name__ == '__main__':
    main()

Then I got this error.

Exception in thread "main" Traceback (most recent call last):
File "D:\Users\JYJU\Desktop\test_java\for_java_test.py", line 1, in <module>
    import pandas as pd
ImportError: No module named pandas

How can I fix this if I want to use pandas?

Asked by Jimmy Chu on Mar 25 '16 at 04:03



2 Answers

You currently cannot use Pandas with Jython, because it depends on CPython-specific native extensions. One dependency is NumPy; the other is Cython (which is not itself a native CPython extension, but generates them).

Keep an eye on the JyNI project ("Jython Native Interface"). It enables Jython to use native CPython extensions, and its exact purpose is to solve issues like the one you encountered. However, it is still under heavy development and cannot yet load Pandas or NumPy into Jython, although both frameworks are high on the priority list.

(For example, ctypes already works to some extent.)

JyNI is also currently POSIX-only (tested on Linux and OS X).

If you don't need Jython specifically, but just some Java/Pandas interoperation, a solution that already works is to embed the CPython interpreter. JPY and JEP are projects that provide this. With either of them you should be able to make Java and Pandas (or any other CPython-specific framework) interoperate.

Answered by stewori on Sep 21 '22 at 14:09


As far as I know, pandas is written in Cython and is a CPython extension. This means it is meant to be used with the CPython implementation of the Python language (the primary implementation, which most people use).

Jython is a Python implementation that runs Python programs on the JVM. It is used to integrate with Java libraries, to provide Python scripting for Java programs, and so on.

Python modules implemented as CPython extensions (like pandas) are not necessarily compatible with all Python implementations (well-known implementations other than CPython are Jython, PyPy and IronPython).
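A script can check which implementation is running it before it tries to import a CPython extension. The following is a minimal sketch that relies only on the standard library's platform.python_implementation(); the two helper names are hypothetical and not part of any library.

```python
import platform


def implementation_name():
    """Return the running implementation's name, e.g. 'CPython' or 'Jython'."""
    return platform.python_implementation()


def running_on_jython():
    """True when this code is executed by Jython rather than another implementation."""
    return implementation_name() == "Jython"
```

Under Jython, running_on_jython() should return True, which is a hint that importing pandas will fail with the ImportError shown in the question.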

If you really have to use Jython and pandas together and cannot find another way to solve the issue, then I suggest running them in separate processes.

One is a Java process: your Jython application running on the JVM (either Java code calling Jython libraries, or Python code that needs to integrate with some Java libraries). The other is a CPython process that performs the operations that require pandas.

Then use some form of IPC (or a tool) to communicate between them (standard I/O, sockets, OS pipes, shared memory, memcache, Redis, etc.).

The Java process sends a request to the CPython process (or registers the request in shared storage), providing the processing parameters. The CPython process uses pandas to calculate the results and sends back a serialized form of them (or puts the results back in the shared storage).

This approach requires extra coding, both to split the tasks into separate processes and to serialize the request and response (how to do that depends on the application and the data it processes).

For example, for the sample code in the question, the Java process can pass the CSV filename to the CPython process, which processes the CSV file using pandas, generates the result CSV file, and returns the name of the new file to the Java process.
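The CPython side of that exchange could look roughly like the sketch below, assuming a line-based protocol over standard I/O: each request line is a CSV path, and each response line is the path of the result file. The function names and the protocol are assumptions made for illustration, not something the question or pandas prescribes. The handler mirrors the question's script (column a times 2, written to catch_test.csv).

```python
def double_column_a(in_path, out_path="catch_test.csv"):
    """Hypothetical handler mirroring the question's script: write a * 2."""
    import pandas as pd  # imported lazily; only the CPython worker needs it

    data = pd.read_csv(in_path)
    (data.a * 2).to_csv(out_path, index=False)
    return out_path


def serve(requests, responses, handler=double_column_a):
    """Answer each non-empty request line (a CSV path) with a result path."""
    for line in requests:
        in_path = line.strip()
        if not in_path:
            continue  # ignore blank lines between requests
        responses.write(handler(in_path) + "\n")
        responses.flush()  # the parent process is waiting on the pipe
```

Ending the script with serve(sys.stdin, sys.stdout) (after import sys) would turn it into a worker that the Java process can start under CPython, for instance with java.lang.ProcessBuilder, writing the CSV filename to the worker's standard input and reading the result filename from its standard output. Passing the handler as a parameter keeps the protocol loop independent of pandas, so it can be exercised without pandas installed.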

Answered by farzad on Sep 19 '22 at 14:09