
R's table function in Julia (for DataFrames)

Is there something like R's table function in Julia? I've read about xtab, but do not know how to use it.

Suppose we have an R data.frame, rdata, whose column col6 is of type factor.

R sample code:

rdata <- read.csv("mycsv.csv")  #1
table(rdata$col6)               #2

To read the data and create factors in Julia, I do this:

using DataFrames
jldata = readtable("mycsv.csv", makefactors=true)  #1 :col6 is now pooled

But how do I build the equivalent of R's table in Julia (that is, how do I achieve #2)?

Maciek Leks asked Jan 07 '16 11:01


2 Answers

You can use the countmap function from StatsBase.jl to count the entries of a single variable. General cross tabulation and statistical tests for contingency tables are lacking at this point. As Ismael points out, this has been discussed in the issue tracker for StatsBase.jl.
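A minimal sketch of the countmap approach, assuming StatsBase.jl is installed. The vector col6 below is a made-up stand-in for the pooled column in the question, not the original data:

```julia
using StatsBase  # provides countmap

# Hypothetical stand-in for the :col6 column from the question.
col6 = ["a", "b", "a", "c", "a", "b"]

# countmap returns a Dict mapping each distinct value to its number of occurrences.
counts = countmap(col6)

# A Dict is unordered, so sort the keys to print something resembling R's table output.
for key in sort(collect(keys(counts)))
    println(key, " => ", counts[key])
end
```

The result is a plain Dict rather than a table object, so any ordering or labeling has to be done by hand, as in the loop above.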

Andreas Noack answered Oct 04 '22 09:10


I came to the conclusion that a similar effect can be achieved using by.

Suppose jldata has a :gender column:

julia> by(jldata, :gender, nrow)
3x2 DataFrames.DataFrame
| Row | gender   | x1    |
|-----|----------|-------|
| 1   | NA       | 175   |
| 2   | "female" | 40254 |
| 3   | "male"   | 58574 |

Of course it's not a table, but at least I get the same data type as the data source. Surprisingly, by seems to be faster than countmap.
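Newer DataFrames.jl releases deprecated by in favor of groupby combined with combine, so on a current installation the same per-group row count can be sketched as follows. The toy data frame and the output column name :count are assumptions made for illustration, with missing playing the role of NA:

```julia
using DataFrames

# Hypothetical toy data standing in for jldata.
jldata = DataFrame(gender = ["female", "male", missing, "female", "male", "male"])

# Group the rows by :gender and count the rows in each group.
# The output column name :count is chosen here for illustration.
gender_counts = combine(groupby(jldata, :gender), nrow => :count)

println(gender_counts)
```

As with by, the result is itself a DataFrame with one row per distinct value, including a row for missing entries.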

Maciek Leks answered Oct 04 '22 09:10