This is puzzling me. When you run summary() on a vector of integers, you don't seem to get accurate results: the numbers appear to be rounded off. I tried this on three different machines with different operating systems, and the results are the same.
For a vector:
> a <- 0:628846
> str(a)
 int [1:628847] 0 1 2 3 4 5 6 7 8 9 ...
> summary(a)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0  157200  314400  314400  471600  628800
> max(a)
[1] 628846
For a data.frame:
> b <- data.frame(b = 0:628846)
> str(b)
'data.frame': 628847 obs. of 1 variable:
$ b: int 0 1 2 3 4 5 6 7 8 9 ...
> summary(b)
       b
 Min.   :     0
 1st Qu.:157212
 Median :314423
 Mean   :314423
 3rd Qu.:471635
 Max.   :628846
> summary(b$b)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0  157200  314400  314400  471600  628800
Why are these results different?
The object a is of class integer, whereas b is of class data.frame. A data frame is a list with certain properties and with class data.frame (http://cran.r-project.org/doc/manuals/R-intro.html#Data-frames). Many functions, including summary, are generic and handle objects of different classes differently: summary(a) is dispatched to summary.default, while summary(b) is dispatched to summary.data.frame, and the two methods format their results differently. (Likewise, you can use summary on an object of class lm and it gives you something completely different.) If you want to apply the function summary to every component of b, you could use lapply:
> a <- 0:628846
> b <- data.frame(b = 0:628846)
> class(a)
[1] "integer"
> class(b)
[1] "data.frame"
> names(b)
[1] "b"
> length(b)
[1] 1
> summary(b[[1]]) # b[[1]] gives the first component of the list b
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0  157200  314400  314400  471600  628800
> class(b$b)
[1] "integer"
> summary(b$b)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0  157200  314400  314400  471600  628800
> lapply(b, summary)
$b
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0  157200  314400  314400  471600  628800
>
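One way to see where the rounding in the vector case comes from is the digits argument of summary. The sketch below assumes an R version that behaves like the output shown above, where the default method keeps only about four significant digits. The digits argument itself is part of summary, but how its default is applied has changed between R releases, so newer versions may already print the full values without it.

```r
a <- 0:628846

# Ask summary() for more significant digits than the default.
s <- summary(a, digits = 10)
print(s)

# With enough digits, the reported maximum agrees with max(a).
s[["Max."]] == max(a)
```

With a larger digits value, the vector summary should agree with the values that summary(b) reports for the data frame column.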
> # example of summary on a linear model
> x <- rnorm(100)
> y <- x + rnorm(100)
> my.lm <- lm(y~x)
> class(my.lm)
[1] "lm"
> summary(my.lm)
Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-2.6847 -0.5460  0.1175  0.6610  2.2976

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.04122    0.09736   0.423    0.673
x            1.14790    0.09514  12.066   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9735 on 98 degrees of freedom
Multiple R-squared:  0.5977,  Adjusted R-squared:  0.5936
F-statistic: 145.6 on 1 and 98 DF,  p-value: < 2.2e-16
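The class-based behaviour shown in these examples can also be sketched with a toy S3 class. The class name temperature and its summary method are hypothetical, made up purely for illustration; getS3method is a real function from the utils package and is used here only to look up the methods that summary would call for a plain vector and for a data frame.

```r
# summary() is generic: it picks a method based on the class of its argument.
default_method <- getS3method("summary", "default")
df_method <- getS3method("summary", "data.frame")
is.function(default_method) && is.function(df_method)

# Hypothetical class with its own summary method (illustration only).
summary.temperature <- function(object, ...) {
  values <- unclass(object)
  cat("Temperature readings:", length(values), "\n")
  invisible(range(values))
}

readings <- structure(c(20.5, 22.1, 19.8), class = "temperature")
r <- summary(readings)  # calls summary.temperature, not summary.default
```

Because each method decides for itself what to compute and how to format it, two summaries of the same underlying numbers need not look alike.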