Crosstabs with data.table in R [duplicate]

Tags: r, data.table, crosstab

I love the data.table package in R, and I think it could help me perform sophisticated cross-tabulation tasks, but I haven't figured out how to use it for tasks similar to what table() does.

Here's some reproducible survey data:

opinion <- c("gov", "market", "gov", "gov")
ID <- c("resp1", "resp2", "resp3", "resp4")
party <- c("GOP", "GOP", "democrat", "GOP")

df <- data.frame(ID, opinion, party)

With base R's table(), counting the number of opinions by party is as simple as table(df$opinion, df$party).

I've managed to do something similar in data.table, but the result is clunky and it adds a separate column.

library(data.table)
dt <- data.table(df)
dt[, .N, by="party"]

There are a number of grouping operations in data.table that could be great for fast and sophisticated crosstabs of survey data, but I haven't found any tutorials on how to do it. Thanks for any help.

asked Oct 04 '15 15:10

tom

1 Answer

We can use dcast from data.table to get the counts in 'wide' format (see the "Efficient reshaping using data.tables" vignette on the project wiki or on the CRAN project page).

dcast(dt, opinion~party, value.var='ID', length)
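As a minimal, self-contained sketch on the toy data from the question (assuming data.table is installed; the name `wide` is only illustrative), the call gives one row per opinion and one column per party. Combinations that never occur, such as market/democrat here, should come out as 0, because the fill value is computed by applying length to an empty vector.

```r
library(data.table)

# Toy survey data from the question
dt <- data.table(
  ID      = c("resp1", "resp2", "resp3", "resp4"),
  opinion = c("gov", "market", "gov", "gov"),
  party   = c("GOP", "GOP", "democrat", "GOP")
)

# One row per opinion, one column per party; each cell counts the IDs
wide <- dcast(dt, opinion ~ party, value.var = "ID", fun.aggregate = length)
print(wide)
```

Naming fun.aggregate explicitly is equivalent to passing length positionally as above; it just makes the intent easier to read.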

Benchmarks

With a slightly bigger dataset, we can compare the speed of dcast from reshape2 with dcast from data.table, and with a plain grouped count:

library(data.table)

set.seed(24)
df <- data.frame(ID=1:1e6, opinion=sample(letters, 1e6, replace=TRUE),
  party= sample(1:9, 1e6, replace=TRUE))

# 1. dcast from reshape2 on a plain data.frame
system.time(reshape2::dcast(df, opinion ~ party, value.var='ID', length))
#   user  system elapsed 
#  0.278   0.013   0.293 

# 2. dcast from data.table (setDT converts df by reference)
system.time(dcast(setDT(df), opinion ~ party, value.var='ID', length))
#   user  system elapsed 
# 0.022   0.000   0.023 

# 3. grouped count with .N (result is in 'long' format)
system.time(setDT(df)[, .N, by = .(opinion, party)])
#  user  system elapsed 
# 0.018   0.001   0.018

The third option is slightly faster, but its result is in 'long' format (one row per opinion/party pair). If the OP wants a 'wide' format, the data.table dcast can be used.
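If the long-format counts are already computed, they can also be spread into wide format afterwards. The following is only a sketch on the question's toy data; the names `counts` and `wide` are illustrative, and fill = 0L is my choice for pairs that never occur in the data.

```r
library(data.table)

# Toy survey data from the question
dt <- data.table(
  ID      = c("resp1", "resp2", "resp3", "resp4"),
  opinion = c("gov", "market", "gov", "gov"),
  party   = c("GOP", "GOP", "democrat", "GOP")
)

# Step 1: grouped count in long format, one row per observed opinion/party pair
counts <- dt[, .N, by = .(opinion, party)]

# Step 2: spread the counts into wide format; unobserved pairs are filled with 0
wide <- dcast(counts, opinion ~ party, value.var = "N", fill = 0L)
print(wide)
```

No aggregation function is needed in the second step, because each opinion/party pair appears at most once in `counts`.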

NOTE: I am using the devel version, i.e. v1.9.7, but the CRAN version should be fast enough.

answered Oct 22 '22 23:10

akrun
