
Is there a Pandas DataFrame equivalent in Java? [duplicate]

Tags: java, dataframe, r

I really like data.frames in R because you can store different types of data in one data structure, there are many methods for modifying the data (adding a column, combining data.frames, ...), and it is easy to extract a subset of the data.

Is there a Java library that offers the same functionality? I'm mostly interested in storing different types of data in a matrix-like fashion and being able to extract a subset of the data.

Using a two-dimensional array in Java can provide a similar structure, but it is much more difficult to add a column and afterwards extract the top k records.

asked Dec 12 '13 at 10:12 by Michael




3 Answers

Tablesaw (https://github.com/jtablesaw/tablesaw) is a Java dataframe library that was begun in 2015 and is under active development (as of 2018). It's designed to be as scalable as possible without sacrificing ease of use. Features include filtering by rows and columns, descriptive statistics, map/reduce functions, cross-tabs, plots, and machine learning. It is released under the Apache license.

In one query test it returned 500+ records from a half-billion-record table in 2 ms.

Contributions, feature requests, and feedback are welcome.

answered Oct 05 '22 at 22:10 by L. Blanc


I have just open-sourced a first draft version of Paleo, a Java 8 library which offers data frames based on typed columns (including support for primitive values). Columns can be created programmatically (through a simple builder API) or imported from a text file.

Please refer to the README for further details.

The project is still wet from birth – I am very interested in feedback / PRs, tia!
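To make the idea of column-based data frames a bit more concrete, here is a minimal plain-Java sketch of the two operations the question asks about: adding a column and extracting the top k records. It is not the API of Paleo or of any other library mentioned in these answers. The class `MiniFrame` and its methods `addColumn`, `column`, and `topK` are hypothetical names made up for this illustration, and the sketch assumes the values in the sort column implement `Comparable` and are non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, minimal column-oriented table. Illustration only;
// not the API of any library mentioned in this thread.
final class MiniFrame {
    // Column name -> column values. LinkedHashMap preserves column order.
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount = 0;

    // Adds a column of any element type; all columns must have the same length.
    MiniFrame addColumn(String name, List<?> values) {
        if (columns.containsKey(name)) {
            throw new IllegalArgumentException("duplicate column: " + name);
        }
        if (!columns.isEmpty() && values.size() != rowCount) {
            throw new IllegalArgumentException(
                "expected " + rowCount + " rows, got " + values.size());
        }
        columns.put(name, new ArrayList<>(values));
        rowCount = values.size();
        return this;
    }

    int rowCount() {
        return rowCount;
    }

    // Returns a typed copy of a column; throws ClassCastException on a type mismatch.
    <T> List<T> column(String name, Class<T> type) {
        List<Object> raw = columns.get(name);
        if (raw == null) {
            throw new IllegalArgumentException("no such column: " + name);
        }
        List<T> typed = new ArrayList<>(raw.size());
        for (Object value : raw) {
            typed.add(type.cast(value));
        }
        return typed;
    }

    // Returns a new frame holding the k rows with the largest values in byColumn.
    // Assumes the values in that column are non-null and mutually Comparable.
    MiniFrame topK(String byColumn, int k) {
        List<Object> keys = columns.get(byColumn);
        if (keys == null) {
            throw new IllegalArgumentException("no such column: " + byColumn);
        }
        // Sort row indices (not the data itself) by the key column, descending.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            order.add(i);
        }
        @SuppressWarnings("unchecked")
        Comparator<Integer> byKeyDescending =
            (a, b) -> ((Comparable<Object>) keys.get(b)).compareTo(keys.get(a));
        order.sort(byKeyDescending);

        // Copy the selected rows, column by column, into the result.
        List<Integer> picked = order.subList(0, Math.min(Math.max(k, 0), rowCount));
        MiniFrame result = new MiniFrame();
        for (Map.Entry<String, List<Object>> entry : columns.entrySet()) {
            List<Object> selected = new ArrayList<>();
            for (int row : picked) {
                selected.add(entry.getValue().get(row));
            }
            result.addColumn(entry.getKey(), selected);
        }
        return result;
    }

    public static void main(String[] args) {
        MiniFrame df = new MiniFrame()
            .addColumn("name", Arrays.asList("a", "b", "c", "d"))
            .addColumn("score", Arrays.asList(3.5, 9.0, 1.2, 7.8));
        // Adding another column of a different type later is a single call.
        df.addColumn("passed", Arrays.asList(true, true, false, true));

        MiniFrame top2 = df.topK("score", 2);
        System.out.println(top2.column("name", String.class)); // prints [b, d]
    }
}
```

Storing the data column by column rather than row by row is what makes adding a column cheap here. Boxed `Object` lists keep the sketch short, but they give up the primitive-value storage that a real library would provide.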

answered Oct 06 '22 at 00:10 by Rahel Lüthy


I also found myself in need of a data frame structure while working in Java recently. Fortunately, after writing a very basic implementation I was able to get approval to release it as open source. You can find my implementation here: Joinery -- Data frames for Java. Contributions and feature requests are welcome.

answered Oct 06 '22 at 00:10 by Bryan Cardillo