Why do powers of 10 print in scientific notation at the 5th power?

Tags: r, scientific-notation

I would like to know if and how the powers of 10 are related to the printing of scientific notation in the console. I've searched R docs and haven't found anything relevant, or that I really understand.

First off, my scipen and digits settings are

    unlist(options("scipen", "digits"))
    # scipen digits
    #      0      7

Now, powers of 10 are printed normally up to the 4th power, and then printing switches to scientific notation at the 5th power.

    10^(1:4)
    # [1]    10   100  1000 10000
    10^(1:5)
    # [1] 1e+01 1e+02 1e+03 1e+04 1e+05

Interestingly, this does not happen for some other numbers larger than 10.

    11^(1:5)
    # [1]     11    121   1331  14641 161051

Judging from the following, 5 digits seem significant.

    100^(1:2)
    # [1]   100 10000
    100^(1:3)
    # [1] 1e+02 1e+04 1e+06

So my questions then are:

Why is scientific notation activated between the 4th and 5th power for 10 and not for other numbers? Is the number 5 significant? Furthermore, why 5 and not a number closer to the maximum digits option of 22?

asked Sep 16 '14 02:09

Rich Scriven

2 Answers

Well, the answer is actually there in the definition of scipen in ?options, although it's pretty hard to understand what it means without playing around with some examples:

‘scipen’: integer. A penalty to be applied when deciding to print numeric values in fixed or exponential notation. Positive values bias towards fixed and negative towards scientific notation: fixed notation will be preferred unless it is more than ‘scipen’ digits wider.

To see what that means, examine the following three pairs of exactly identical numbers. In the first two cases, the width in characters of the fixed notation is less than or equal to the width of the scientific notation, so fixed notation is preferred.

In the third case, though, the fixed notation is wider (i.e. "more than 0 digits wider"), because the 5 zeros amount to more characters than the 4 characters used to represent the same value using e+nn. As a result, in that case scientific notation is preferred.

    1e+03
    1000
    # [1] 1000

    1e+04
    10000
    # [1] 10000

    1e+05
    100000      ## <- wider
    # [1] 1e+05
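
The width comparison described above can be checked by forcing each notation with format() and counting characters. This is only a sketch: `widths` is a helper name made up for this illustration, and it reproduces the comparison for single values rather than calling R's internal formatting code.

```r
# Compare the printed width of fixed vs. scientific notation for one value.
# `widths` is a hypothetical helper, not part of base R.
widths <- function(x) {
  fixed <- format(x, scientific = FALSE)  # force fixed notation
  sci   <- format(x, scientific = TRUE)   # force scientific notation
  c(fixed = nchar(fixed), sci = nchar(sci))
}

widths(1e3)  # fixed 4, sci 5: fixed is narrower
widths(1e4)  # fixed 5, sci 5: a tie, so fixed is still preferred
widths(1e5)  # fixed 6, sci 5: fixed is wider, so scientific wins
```

With a penalty of 0, the tie at 1e4 goes to fixed notation, which is why the switch only appears at 1e5.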

Next, examine some numbers that also end with lots of zeros, but whose representation in scientific notation requires a decimal point. For these numbers, scientific notation will be used once you have 6 or more zeros (i.e. more than the 5 characters taken up by one "." and the characters e+nn).

    1.1e+06
    1100000
    # [1] 1100000

    1.1e+07
    11000000     ##  <- wider
    # [1] 1.1e+07

Reasoning about the tradeoff gets a bit trickier for most other numbers, for which the values of both options("scipen") and options("digits") come into play, but the general idea is exactly the same.
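
To see how the penalty shifts the switch point, the sketch below uses the integer form of the `scientific` argument of format(), which acts as a temporary scipen value so the global option stays untouched. The variable names are illustrative only, and the comments follow the rule quoted from ?options (fixed is kept unless it is more than scipen characters wider).

```r
# Format each value on its own under three different penalties.
x <- c(1e4, 1e5, 1e6)

pen_zero  <- sapply(x, format, scientific = 0)   # default penalty
pen_plus  <- sapply(x, format, scientific = 1)   # fixed may be 1 character wider
pen_minus <- sapply(x, format, scientific = -1)  # fixed must be 1 character narrower

pen_zero   # "10000" "1e+05"  "1e+06"
pen_plus   # "10000" "100000" "1e+06"
pen_minus  # "1e+04" "1e+05"  "1e+06"
```

A positive penalty delays the switch by one power of 10 here, and a negative one brings it forward by one.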

To see some of the slightly surprising complications that come into play, you might want to paste the following into your console (perhaps after first trying to predict where within each series the switch to scientific notation will occur).

    100001 1000001 10000001 100000001 1000000001 10000000001 100000000001 1000000000001

    111111 1111111 11111111 111111111 1111111111 11111111111 111111111111 1111111111111
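
One way to check a prediction without pasting each value by hand is a small helper like the one below. It is a sketch under stated assumptions: `uses_sci` is a made-up name, it assumes digits = 7 and scipen = 0 as in the question, and it formats each value separately (as happens when a value is typed alone at the console).

```r
# Report which values in a series print in scientific notation.
# `uses_sci` is a hypothetical helper, not part of base R.
uses_sci <- function(x, digits = 7, scipen = 0) {
  # Format every value individually so neighbours do not influence the result.
  printed <- vapply(x, format, character(1),
                    digits = digits, scientific = scipen)
  data.frame(
    value   = format(x, scientific = FALSE, trim = TRUE),  # full fixed form
    printed = printed,                                     # what the console shows
    sci     = grepl("e", printed, fixed = TRUE),           # TRUE if scientific
    stringsAsFactors = FALSE
  )
}

zeros <- c(100001, 1000001, 10000001, 100000001, 1000000001,
           10000000001, 100000000001, 1000000000001)
ones  <- c(111111, 1111111, 11111111, 111111111, 1111111111,
           11111111111, 111111111111, 1111111111111)

uses_sci(zeros)
uses_sci(ones)
```

The first row of each table whose `sci` column is TRUE marks where that series switches.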

answered Sep 20 '22 13:09

Josh O'Brien

I'm confused as to what exactly your question is; or, more specifically, how you would use an answer to this question to somehow change/control the behavior of R. Are you trying to format numbers a certain way? There are better ways to do that.

When you type values like that, the results are implicitly run through one of the print() commands to be formatted "nicely" for the console. Whenever things have to look "nice" on screen, the code to do that is often ugly. Here most of that code is taken care of by the formatReal function and its helper, the scientific function. The latter tracks the following information for a number:

    /* for a number x , determine
     *  sgn    = 1_{x < 0}  {0/1}
     *  kpower = Exponent of 10;
     *  nsig   = min(R_print.digits, #{significant digits of alpha})
     *  roundingwidens = 1 if rounding causes x to increase in width, 0 otherwise
     *
     * where  |x| = alpha * 10^kpower   and  1 <= alpha < 10
     */

Then the former function uses this information to try to make "nice" looking numbers by balancing values to the left and the right of the decimal place. It's a combination of many things, like the order of magnitude of the number and the number of significant digits, as well as environmental influences from the scipen option, etc.

print() is only meant to make things look "nice." What exactly is nice depends on all the values in a vector. You'll find few hard cutoffs in that code; it's very adaptive. There is no easy way to concisely describe everything it does in the general case (which is what it sounds like you are asking for).
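
The dependence on the whole vector can be seen with the values from the question. The sketch below uses format() with an explicit penalty of 0 in place of print(), so it does not rely on the session's scipen setting; the variable names are only for illustration.

```r
# One common format is chosen for the whole vector, so the same value
# (10) is rendered differently depending on what it is printed with.
alone  <- format(10,       scientific = 0)  # "10"
short  <- format(10^(1:4), scientific = 0)  # fixed, padded to a common width
longer <- format(10^(1:5), scientific = 0)  # adding 1e5 flips every element

alone      # "10"
short[1]   # "   10"
longer[1]  # "1e+01"
```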

The only thing that is certain is that if you need to have your numbers formatted in a certain way, use a function like sprintf() or formatC() that allows for precise control.
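
As a rough illustration of that kind of explicit control, the calls below each fix the notation no matter what scipen and digits are set to. The value of `x` is arbitrary, chosen only for this example.

```r
x <- 123456.789  # arbitrary example value

fixed2 <- sprintf("%.2f", x)    # always fixed, 2 decimals
sci3   <- sprintf("%.3e", x)    # always scientific, 3 decimals
commas <- formatC(x, format = "f", digits = 0, big.mark = ",")  # thousands separator
plain  <- format(1e5, scientific = FALSE)  # fixed notation for one call only

fixed2  # "123456.79"
sci3    # "1.235e+05"
commas  # "123,457"
plain   # "100000"
```

All four return character strings, so they are meant for display rather than for further arithmetic.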

Of course, this behavior depends on class(), and I've pointed to the formatReal code since that's where most of the tricky things happen. But observe the difference when you use integers:

    c(10, 100, 1000, 10000, 100000)
    # [1] 1e+01 1e+02 1e+03 1e+04 1e+05
    c(10L, 100L, 1000L, 10000L, 100000L)
    # [1]     10    100   1000  10000 100000

answered Sep 19 '22 13:09

MrFlick
