
How to create a data frame from R output

Tags:

r

I'm trying to create a dataset from the output of multiple operations, but I don't know how to automate this. The replicate function would be nice, but several operations are needed to get each new data point, i.e., the adjusted R-squared and the F-statistic.

R code:

#make dataframe with random data
A<-as.integer(round(runif(20, min=1, max=10)))
dim(A) <- c(10,2)
A<-as.data.frame(A)
#extract F-statistic
summary(lm(formula=V1~V2,data=A))$fstatistic[1]
#extract adjusted R squared
summary(lm(formula=V1~V2,data=A))$adj.r.squared
#repeat 100 times and make a dataframe of the unique extracted output, e.g. 2 columns 100 rows
??????????????
Scuba Steve asked Nov 22 '25 08:11



1 Answer

Applying the linear model over 5 data frames...

With replicate, it would be something like

> replicate(5, {
      A <- data.frame(rnorm(5), rexp(5))
      m <- lm(formula = A[,1] ~ A[,2], data = A)
      c(f = summary(m)$fstatistic[1], adjR = summary(m)$adj.r.squared)
  })
##               [,1]      [,2]       [,3]      [,4]        [,5]
## f.value  0.4337426 1.3524681 1.17570087 3.8537837  0.04583862
## adjR    -0.1649097 0.0809812 0.04207698 0.4163808 -0.31326721

And you can wrap this with t() to get the long form matrix.
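As a minimal sketch of that transposition: the seed, the column names `x` and `y`, and the variable names `res` and `long` are illustrative assumptions, not part of the original answer.

```r
set.seed(42)  # assumption: fixed seed only so the run is reproducible

# replicate() returns a 2 x 5 matrix: one column per simulated fit
res <- replicate(5, {
  A <- data.frame(x = rnorm(5), y = rexp(5))
  s <- summary(lm(x ~ y, data = A))  # compute the summary once and reuse it
  c(f = unname(s$fstatistic[1]), adjR = s$adj.r.squared)
})

# t() flips it to 5 x 2: one row per fit, one column per statistic
long <- t(res)
dim(long)
colnames(long)
```

Dropping the inner name with unname() is a stylistic choice here; it yields a column called `f` rather than `f.value`.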

You could also use the ever-popular do.call(rbind, lapply(...)) method,

> do.call(rbind, lapply(seq(5), function(x){
      A <- data.frame(rnorm(5), rexp(5))
      m <- lm(formula = A[,1] ~ A[,2], data = A)
      c(f = summary(m)$fstatistic[1], adjR = summary(m)$adj.r.squared)
  }))
##          f.value        adjR
## [1,]   1.9820243  0.19711351
## [2,]  21.6698543  0.83785879
## [3,]   4.4484639  0.46297652
## [4,]   0.9084373 -0.02342693
## [5,]   0.0388510 -0.31628698

You can also use sapply,

> sapply(seq(5), function(x){
      A <- data.frame(rnorm(5), rexp(5))
      m <- lm(formula = A[,1] ~ A[,2], data = A)
      c(f = summary(m)$fstatistic[1], adjR = summary(m)$adj.r.squared)
  })
##                    [,1]       [,2]          [,3]       [,4]        [,5]
## f.value      0.07245221  0.2076504  0.0003488657 58.5524139  0.92170453
## adjR        -0.30189169 -0.2470187 -0.3331783000  0.9350147 -0.01996465

Keep in mind that these all return a matrix, so wrap the result in as.data.frame() if you would like a data.frame. For the replicate and sapply versions, transpose with t() first so that each row is one model fit.
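Putting the pieces together for the question's case (100 repetitions, 2 columns), a rough sketch might look like the following. The helper name `one_fit`, the result name `result`, and the fixed seed are hypothetical choices for illustration; the data generation mirrors the question's 10 x 2 integer frame.

```r
set.seed(1)  # assumption: fixed seed only for reproducibility

# One simulated fit on a fresh 10 x 2 frame of random integers (columns V1, V2)
one_fit <- function() {
  A <- as.data.frame(matrix(as.integer(round(runif(20, min = 1, max = 10))), ncol = 2))
  s <- summary(lm(V1 ~ V2, data = A))
  c(f = unname(s$fstatistic[1]), adjR = s$adj.r.squared)
}

# replicate() gives a 2 x 100 matrix; transpose it, then convert to a data.frame
result <- as.data.frame(t(replicate(100, one_fit())))
str(result)
```

This should give a data.frame with 100 rows and the columns `f` and `adjR`, one row per simulated regression.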

Rich Scriven answered Nov 24 '25 23:11



