Fit models by group using the data.table package

Tags:

r

data.table

How can I fit multiple models by group using data.table syntax? I want my output to be a data.frame with one column for each "by" group and one column for each model fit. I can currently do this with the dplyr package, but not with data.table.

# example data frame
library(data.table)

N <- 100  # assumed value; N is not defined in the original question
df <- data.table(
   id = sample(c("id01", "id02", "id03"), N, TRUE),
   v1 = sample(5, N, TRUE),
   v2 = sample(round(runif(100, max = 100), 4), N, TRUE)
)

# equivalent code in dplyr
library(dplyr)

group_by(df, id) %>%
  do(model1 = lm(v1 ~ v2, .),
     model2 = lm(v2 ~ v1, .))

# attempt in data.table
df[, .(model1 = lm(v1~v2, .SD), model2 = lm(v2~v1, .SD) ), by = id ]

# Brodie G's solution
df[, .(model1 = list(lm(v1~v2, .SD)), model2 = list(lm(v2~v1, .SD))), by = id ]

asked Apr 02 '15 19:04

k13

1 Answer

Try:

df[, .(model1 = list(lm(v1~v2, .SD)), model2 = list(lm(v2~v1, .SD))), by = id ]

or slightly more idiomatically:

formulas <- list(v1~v2, v2~v1)
df[, lapply(formulas, function(x) list(lm(x, data=.SD))), by=id]

answered Oct 17 '22 20:10

BrodieG
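
A follow-up sketch, not part of the original answer: the `lapply` form names its columns `V1` and `V2` by default, so naming the elements of the formula list should give `model1` and `model2` instead. Because each fitted model sits in one cell of a list column, it can be pulled out again with `sapply` or `[[`. The values of `N` and the seed below are assumptions, since the question does not define them, and `fits` and `slopes` are illustrative names.

```r
library(data.table)

set.seed(1)  # assumed seed, only so the sketch is reproducible
N <- 100     # assumed value; the question leaves N undefined

df <- data.table(
  id = sample(c("id01", "id02", "id03"), N, TRUE),
  v1 = sample(5, N, TRUE),
  v2 = sample(round(runif(100, max = 100), 4), N, TRUE)
)

# Naming the list elements names the resulting model columns
formulas <- list(model1 = v1 ~ v2, model2 = v2 ~ v1)

# An lm object is itself a list with many components, so each fit is
# wrapped in list() to make it occupy a single cell of a list column
fits <- df[, lapply(formulas, function(f) list(lm(f, data = .SD))), by = id]

# One row per id; model1 and model2 are list columns holding lm objects
print(names(fits))

# Pull the v2 slope of model1 out of each group's fitted model
slopes <- fits[, .(id, slope = sapply(model1, function(m) coef(m)[["v2"]]))]
print(slopes)
```

A single model can then be inspected with, for example, `summary(fits$model1[[1]])`.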
