
How to return only the degrees of freedom from a summary of a regression in R?

I would like to return only the df (degrees of freedom) from the summary. I searched the Internet but did not find anything on this.

    y=c(2,13,0.4,5,8,10,13)
    y1=c(2,13,0.004,5,8,1,13)
    y2=c(2,3,0.004,15,8,10,1)
    y3=c(2,2,2,2,2,2,NA)
    fit=lm(y~y1+y2+y3)
    summary(fit)
    Call:
    lm(formula = y ~ y1 + y2 + y3)

    Residuals:
          1       2       3       4       5       6 
    -1.5573  1.6523 -1.3718 -3.2909 -0.9247  5.4924 

    Coefficients: (1 not defined because of singularities)
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   1.7682     3.0784   0.574    0.606
    y1            0.6896     0.3649   1.890    0.155
    y2            0.2050     0.3184   0.644    0.566
    y3                NA         NA      NA       NA

    Residual standard error: 4.037 on 3 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared:   0.58,    Adjusted R-squared:    0.3 
    F-statistic: 2.071 on 2 and 3 DF,  p-value: 0.2722

Is there any function that returns only the df? For example:

         df(fit) or fit$df
         3
sacvf asked Feb 12 '14 16:02


2 Answers

In the comments, the OP mentions they are using lm.fit() rather than lm(), so the example code is quite different: lm.fit() needs the response vector and the correct model matrix to be supplied by the user, whereas lm() does all that for you. This means we have to handle the NA in x3 ourselves. Either way, df.residual() works for that example too:

Xy <- cbind(y  = c(2,13,0.4,5,8,10,13),
           x0 = rep(1, 7),
           x1 = c(2,13,0.004,5,8,1,13),
           x2 = c(2,3,0.004,15,8,10,1),
           x3 = c(2,2,2,2,2,2,NA))
Xy <- Xy[complete.cases(Xy), ]
X <- Xy[, -1]
y <- Xy[,  1]

fit <- lm.fit(X, y)

R> df.residual(fit)
[1] 3
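
As a cross-check on where the 3 comes from: the residual degrees of freedom are the number of rows actually used in the fit minus the rank of the model matrix. A minimal sketch, reusing the data from the example above (the names n_obs and manual_df are only for illustration):

```r
# Rebuild the example data and drop the incomplete row, as above
Xy <- cbind(y  = c(2, 13, 0.4, 5, 8, 10, 13),
            x0 = rep(1, 7),
            x1 = c(2, 13, 0.004, 5, 8, 1, 13),
            x2 = c(2, 3, 0.004, 15, 8, 10, 1),
            x3 = c(2, 2, 2, 2, 2, 2, NA))
Xy <- Xy[complete.cases(Xy), ]
fit <- lm.fit(Xy[, -1], Xy[, 1])

n_obs <- nrow(Xy)               # rows left after dropping the NA row
manual_df <- n_obs - fit$rank   # x3 is constant, so it adds nothing to the rank

# Compare the hand-computed value with the extractor
manual_df == df.residual(fit)
```

Here x3 is collinear with the intercept column x0, which is why the rank is lower than the number of columns in the model matrix.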

For a model fitted with lm(), inspect the fitted object fit:

Xy <- data.frame(y = c(2,13,0.4,5,8,10,13),
                 x1 = c(2,13,0.004,5,8,1,13),
                 x2 = c(2,3,0.004,15,8,10,1),
                 x3 = c(2,2,2,2,2,2,NA))
fit <- lm(y ~ x1 + x2 + x3, data = Xy)

str(fit, max = 1)

R> str(fit, max = 1)
List of 13
 $ coefficients : Named num [1:4] 1.768 0.69 0.205 NA
  ..- attr(*, "names")= chr [1:4] "(Intercept)" "x1" "x2" "x3"
 $ residuals    : Named num [1:6] -1.557 1.652 -1.372 -3.291 -0.925 ...
  ..- attr(*, "names")= chr [1:6] "1" "2" "3" "4" ...
 $ effects      : Named num [1:6] -15.68 -7.79 2.6 -3.22 -0.98 ...
  ..- attr(*, "names")= chr [1:6] "(Intercept)" "x1" "x2" "" ...
 $ rank         : int 3
 $ fitted.values: Named num [1:6] 3.56 11.35 1.77 8.29 8.92 ...
  ..- attr(*, "names")= chr [1:6] "1" "2" "3" "4" ...
 $ assign       : int [1:4] 0 1 2 3
 $ qr           :List of 5
  ..- attr(*, "class")= chr "qr"
 $ df.residual  : int 3
 $ na.action    :Class 'omit'  Named int 7
  .. ..- attr(*, "names")= chr "7"
 $ xlevels      : Named list()
 $ call         : language lm(formula = y ~ x1 + x2 + x3, data = Xy)
 $ terms        :Classes 'terms', 'formula' length 3 y ~ x1 + x2 + x3
     .... <removed>
 $ model        :'data.frame':  6 obs. of  4 variables:
     .... <removed>
 - attr(*, "class")= chr "lm"

There you'll note the df.residual component. You could extract it as you would any other component of a list:

R> fit$df.residual
[1] 3

but that would be to miss the extractor function df.residual(), which does it all for you

R> df.residual(fit)
[1] 3

The nice thing about this is that, should a function writer care, they can include a method for df.residual() in their package so that it works for their class of models too, whilst you only have to remember a single function name.
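
To illustrate what such a method could look like, here is a minimal sketch for a made-up model class. The class name mean_model and its constructor are hypothetical and exist only for this example; the only real function involved is the generic df.residual() from the stats package:

```r
# Hypothetical constructor for a toy model class "mean_model":
# it fits only an intercept (the sample mean) to a numeric vector.
mean_model <- function(y) {
  y <- y[!is.na(y)]
  structure(list(estimate = mean(y), n = length(y)), class = "mean_model")
}

# S3 method: df.residual() dispatches here for "mean_model" objects.
# One parameter is estimated, so the residual df is n - 1.
df.residual.mean_model <- function(object, ...) {
  object$n - 1L
}

toy_fit <- mean_model(c(2, 13, 0.4, 5, 8, 10, 13))
df.residual(toy_fit)
```

In a real package the method would also need to be registered in the NAMESPACE file; at the top level of a script, defining the function as above is enough for dispatch to find it.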

Gavin Simpson answered Nov 03 '22 00:11



It is exactly what you suggest in your question:

y=runif(20)
x=runif(20)
lm(y~x)$df

> lm(y~x)$df
[1] 18
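
One caveat worth knowing: an lm object has no component literally named df. The $ operator does partial name matching, and df.residual happens to be the only component whose name starts with "df". A short sketch of what that implies (the seed is arbitrary and only there for reproducibility):

```r
set.seed(1)  # arbitrary seed, only to make the example reproducible
y <- runif(20)
x <- runif(20)
fit <- lm(y ~ x)

fit$df             # partial match on the component df.residual
fit[["df"]]        # NULL: [[ ]] matches names exactly by default
df.residual(fit)   # the extractor does not rely on partial matching

# A glm object has both df.residual and df.null, so the
# partial match is ambiguous and $df returns NULL.
gfit <- glm(y ~ x)
is.null(gfit$df)
```

So the shortcut works for lm fits, but df.residual() is the safer habit if the same code may later be applied to other model classes.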
Hans Roggeman answered Nov 03 '22 00:11