
Java object analogue to R data.frame [closed]

Tags: java, dataframe, r

I really like data.frames in R because you can store different types of data in one data structure, there are many methods for modifying the data (adding a column, combining data.frames, ...), and it is easy to extract a subset of the data.

Is there a Java library that offers the same functionality? I'm mostly interested in storing different types of data in a matrix-like fashion and being able to extract a subset of the data.

A two-dimensional array in Java provides a similar structure, but it is much more difficult to add a column and then extract the top k records.
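To make the requirement concrete, here is a minimal sketch in plain Java of the operations described above: columns of different types stored side by side, adding a column, and extracting the top k records. The class `MiniFrame` and its methods are hypothetical names made up for illustration, not the API of any existing library, and the sketch assumes the sort column holds `Number` values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical, minimal column-oriented table; an illustration only. */
class MiniFrame {
    // Column name -> column values; LinkedHashMap preserves insertion order.
    private final Map<String, List<?>> columns = new LinkedHashMap<>();
    private int rowCount = -1; // -1 until the first column is added

    /** Adds a column; every column must have the same number of rows. */
    MiniFrame addColumn(String name, List<?> values) {
        if (columns.containsKey(name)) {
            throw new IllegalArgumentException("duplicate column: " + name);
        }
        if (rowCount >= 0 && values.size() != rowCount) {
            throw new IllegalArgumentException(
                "expected " + rowCount + " rows, got " + values.size());
        }
        rowCount = values.size();
        columns.put(name, new ArrayList<>(values)); // defensive copy
        return this;
    }

    int rowCount() {
        return Math.max(rowCount, 0);
    }

    /** Returns a read-only view of one column. */
    List<?> column(String name) {
        List<?> col = columns.get(name);
        if (col == null) {
            throw new IllegalArgumentException("no such column: " + name);
        }
        return Collections.unmodifiableList(col);
    }

    /** Returns a new frame holding only the given rows, in the given order. */
    MiniFrame selectRows(List<Integer> rows) {
        MiniFrame out = new MiniFrame();
        for (Map.Entry<String, List<?>> e : columns.entrySet()) {
            List<Object> picked = new ArrayList<>();
            for (int r : rows) {
                picked.add(e.getValue().get(r));
            }
            out.addColumn(e.getKey(), picked);
        }
        return out;
    }

    /** Returns the k rows with the largest values in a numeric column. */
    MiniFrame topK(String numericColumn, int k) {
        List<?> col = column(numericColumn);
        List<Integer> order = IntStream.range(0, rowCount()).boxed()
            .sorted(Comparator.comparingDouble(
                (Integer i) -> ((Number) col.get(i)).doubleValue()).reversed())
            .limit(k)
            .collect(Collectors.toList());
        return selectRows(order);
    }

    public static void main(String[] args) {
        MiniFrame df = new MiniFrame()
            .addColumn("name", List.of("a", "b", "c"))
            .addColumn("score", List.of(1.5, 9.0, 4.2));
        df.addColumn("passed", List.of(false, true, true)); // add a column later
        System.out.println(df.topK("score", 2).column("name")); // prints [b, c]
    }
}
```

Storing the data by column rather than as a two-dimensional array is what makes adding a column a single map insertion; the trade-off is that values are held as `Object`, so type safety is only checked at run time.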

Michael asked Dec 12 '13 10:12



2 Answers

Tablesaw (https://github.com/jtablesaw/tablesaw) is a Java dataframe library begun in 2015 and under active development (as of 2018). It's designed to be as scalable as possible without sacrificing ease of use. Features include filtering by rows and columns, descriptive statistics, map/reduce functions, cross-tabs, plots, and machine learning. It is released under the Apache license.

In one query test, it returned 500+ records from a half-billion-record table in 2 ms.

Contributions, feature requests, and feedback are welcome.

L. Blanc answered Oct 14 '22 14:10



I have just open-sourced a first draft version of Paleo, a Java 8 library which offers data frames based on typed columns (including support for primitive values). Columns can be created programmatically (through a simple builder API) or imported from a text file.

Please refer to the README for further details.

The project is still wet from birth – I am very interested in feedback / PRs, tia!

Rahel Lüthy answered Oct 14 '22 16:10
