
Why does summary() give a different maximum to max()?

Tags: r

Using R 2.15.2 on Windows XP, I get a different maximum from summary() than from max(). Why is that?

Here is the relevant code:

> class(dat)
[1] "data.frame"
> dim(dat)
[1] 3850   54
> summary(dat$enrol)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    26     945    1744    3044    3128  183200 
> max(dat$enrol)
[1] 183151

Any ideas why summary() rounds the result up?

Best, Oliver

asked Jan 26 '13 12:01 by Florian Mans


1 Answer

The difference comes from how summary() formats its results according to the digits argument, not from the underlying data. The default is

> max(3, getOption("digits")-3)
[1] 4

R rounds up here because, with only four significant digits kept, 183151 is closer to 183200 than to 183100. The round-half-to-even rule only comes into play for exact ties. We can see this in action with signif():

> signif(183151, digits = 4)
[1] 183200

which, as ?summary tells us, is what is used by summary() and is controlled by the digits argument:

digits: integer, used for number formatting with ‘signif()’ (for
        ‘summary.default’) or ‘format()’ (for ‘summary.data.frame’).

Read ?signif for more on the rounding issue.
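
The rounding described above can be reproduced with base R alone. This is a minimal sketch: the value 183151 comes from the question, while the variable names are purely illustrative.

```r
x <- 183151  # the maximum reported by max() in the question

# Number of significant digits derived from the global option,
# as in the expression quoted in the answer
default_digits <- max(3, getOption("digits") - 3)  # 4 when options(digits = 7)

# Keeping four significant digits moves x to the nearest multiple of 100
rounded4 <- signif(x, digits = 4)

# Six significant digits are enough to keep x unchanged
rounded6 <- signif(x, digits = 6)

# Round-half-to-even only matters for exact ties
ties <- round(c(0.5, 1.5, 2.5))

print(c(rounded4, rounded6))
print(ties)
```

Because signif() is called directly, this sketch does not depend on how a particular R version prints summary() output.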

To get more significant digits, pass a higher number to summary() via the digits argument.

For example

> set.seed(1)
> vec <- c(10, 100, 1e4, 1e5, 1e6) + runif(5)
> summary(vec)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
     10.3     100.4   10000.0  222000.0  100000.0 1000000.0 
> summary(vec, digits = 7)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
     10.3     100.4   10000.6  222022.5  100000.9 1000000.0 
> summary(vec, digits = 8)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
     10.3     100.4   10000.6  222022.5  100000.9 1000000.2 
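
If the exact values matter more than a compact printout, one option (not part of the original answer) is to call the underlying functions directly, since max() and quantile() return their results without the significant-digit rounding. A minimal sketch, assuming a made-up vector `enrol` in place of dat$enrol; only its minimum and maximum are taken from the question:

```r
# Hypothetical stand-in for dat$enrol; the middle values are invented
enrol <- c(26, 945, 1744, 3128, 183151)

# Exact maximum, with no rounding to significant digits
exact_max <- max(enrol)

# Exact quartiles, computed with quantile()'s default method
exact_quartiles <- quantile(enrol, probs = c(0.25, 0.5, 0.75))

print(exact_max)
print(exact_quartiles)
```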

answered Oct 04 '22 16:10 by Gavin Simpson