
Convert numbers to SI prefix

Is there an R function (or a package) that formats numbers (integers) using the standard SI unit prefixes (kilo, mega, etc.), so that

10 -> 10
1000 -> 1K
0.01 -> 10m

and so on? I could write it myself, but I would rather not reinvent the wheel.

mb14 asked Jul 05 '12 08:07


3 Answers

library(sitools)
f2si(80000)
 [1] "80 k"
f2si(8E12)
 [1] "8 T"

It is fairly simplistic, though: when no SI prefix applies, it still appends two spaces:

f2si(80)
[1] "80  "

The function is easy to modify to include rounding. I also fixed the issue with appended spaces.

f2si2 <- function(number, rounding = FALSE)
{
    # Powers of 1000 and the matching SI prefixes ("" means no prefix).
    lut <- c(1e-24, 1e-21, 1e-18, 1e-15, 1e-12, 1e-09, 1e-06,
        0.001, 1, 1000, 1e+06, 1e+09, 1e+12, 1e+15, 1e+18, 1e+21,
        1e+24)
    pre <- c("y", "z", "a", "f", "p", "n", "u", "m", "", "k",
        "M", "G", "T", "P", "E", "Z", "Y")
    # Index of the largest power of 1000 that does not exceed number.
    ix <- findInterval(number, lut)
    if (lut[ix] != 1) {
        if (isTRUE(rounding)) {
            sistring <- paste(round(number/lut[ix]), pre[ix])
        } else {
            sistring <- paste(number/lut[ix], pre[ix])
        }
    } else {
        # No prefix needed: return the number without trailing spaces.
        sistring <- as.character(number)
    }
    return(sistring)
}

f2si2(12345)
 [1] "12.345 k"
f2si2(12345,T)
 [1] "12 k"
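
To see what the lookup is doing, here is a small sketch of the findInterval step on its own, using the same lut and pre tables as the function above. The sample values in x are only illustrative. For each number, findInterval returns the index of the largest table entry that does not exceed it, and that index selects both the divisor and the prefix.

```r
# Same tables as in f2si2 above.
lut <- c(1e-24, 1e-21, 1e-18, 1e-15, 1e-12, 1e-09, 1e-06,
         0.001, 1, 1000, 1e+06, 1e+09, 1e+12, 1e+15, 1e+18, 1e+21,
         1e+24)
pre <- c("y", "z", "a", "f", "p", "n", "u", "m", "", "k",
         "M", "G", "T", "P", "E", "Z", "Y")

x  <- c(0.01, 10, 1000, 80000, 8e12)  # illustrative inputs
ix <- findInterval(x, lut)            # position of the largest lut entry <= x

prefix <- pre[ix]                     # prefix chosen for each input
scaled <- x / lut[ix]                 # mantissa printed in front of the prefix

print(data.frame(x, ix, prefix, scaled))
```

Note that findInterval returns 0 for anything smaller than the first table entry, so lut[ix] is empty in that case and the if() test in the function fails.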

Roland answered Nov 15 '22 17:11


I came here with the same question. Thanks to Roland for his answer; I built on his code with a few changes:

  • Allows significant figures to be specified when rounding=FALSE (defaults to 6 just like the 'signif' builtin function)
  • Doesn't throw an error with values below 1e-24
  • Outputs scientific notation (no units) for values of 1e27 and above

Hope this is helpful.

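f2si <- function(number, rounding = FALSE, digits = ifelse(rounding, NA, 6))
{
    # Powers of 1000 and matching SI prefixes. The extra 1e+27 entry marks
    # the upper limit, from which no prefix is applied.
    lut <- c(1e-24, 1e-21, 1e-18, 1e-15, 1e-12, 1e-09, 1e-06,
        0.001, 1, 1000, 1e+06, 1e+09, 1e+12, 1e+15, 1e+18, 1e+21,
        1e+24, 1e+27)
    pre <- c("y", "z", "a", "f", "p", "n", "u", "m", "", "k",
        "M", "G", "T", "P", "E", "Z", "Y", NA)
    ix <- findInterval(number, lut)
    # ix == 0 means number < 1e-24; ix == length(lut) means number >= 1e+27.
    if (ix > 0 && ix < length(lut) && lut[ix] != 1) {
        if (isTRUE(rounding) && !is.numeric(digits)) {
            # Rounding requested without digits: round to a whole number.
            sistring <- paste(round(number/lut[ix]), pre[ix])
        } else if (isTRUE(rounding) || is.numeric(digits)) {
            # Otherwise keep `digits` significant figures.
            sistring <- paste(signif(number/lut[ix], digits), pre[ix])
        } else {
            sistring <- paste(number/lut[ix], pre[ix])
        }
    } else {
        # Out of range, or no prefix needed: print the number as-is.
        sistring <- as.character(number)
    }
    return(sistring)
}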

f2si(12345)
 [1] "12.345 k"
f2si(12345, T)
 [1] "12 k"
f2si(10^31)
 [1] "1e+31" # (previous version would output "1e+07 Y")
f2si(10^-25)
 [1] "1e-25" # (previous version would throw error)
f2si(123456789)
 [1] "123.457 M" # (previous version would output "123.456789 M")
f2si(123456789, digits=4)
 [1] "123.5 M" # (note .456 is rounded up to .5)

From this code it's pretty easy to write a similar function for commonly used financial units (K, MM, Bn, Tr), too.
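
As a rough illustration of that last point, here is a minimal sketch of such a function. The name f2fin, the default of 3 significant figures, and the use of abs() so that negative amounts get a suffix too are all my own assumptions, not part of the code above.

```r
# Hypothetical helper: format a single number with financial suffixes.
f2fin <- function(number, digits = 3)
{
    lut <- c(1, 1e3, 1e6, 1e9, 1e12)
    suf <- c("", "K", "MM", "Bn", "Tr")
    # Look up the magnitude on the absolute value so negatives work as well.
    ix <- findInterval(abs(number), lut)
    if (ix == 0) {
        # |number| < 1: no suffix applies, just limit the significant figures.
        return(as.character(signif(number, digits)))
    }
    paste0(signif(number / lut[ix], digits), suf[ix])
}

f2fin(12345)   # "12.3K"
f2fin(2.5e6)   # "2.5MM"
f2fin(7e9)     # "7Bn"
f2fin(-12345)  # "-12.3K"
```

Anything of 1e15 or more simply keeps the "Tr" suffix with a larger mantissa; add more entries to both tables if you need them.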

tomelgin answered Nov 15 '22 18:11


This is simple to vectorise using case_when from dplyr, and it's much easier on the eyes:

library(dplyr)

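si_number <- function(x, digits) {

    # Scale x by 10^(-n) and keep `digits` significant figures.
    compress <- function(x, n) {
        signif(x * 10^(-n), digits)
    }

    # The first matching condition wins, so the largest threshold comes first.
    case_when(
        x >= 1e6   ~ paste0(compress(x, 6), "M"),
        x >= 1000  ~ paste0(compress(x, 3), "k"),
        x >= 1     ~ as.character(compress(x, 0)),
        x >= 0.001 ~ paste0(compress(x, -3), "m"),
        x >= 1e-6  ~ paste0(compress(x, -6), "u"),
        TRUE       ~ as.character(x)  # outside the listed range: leave unscaled
    )
}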
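
If you would rather not depend on dplyr, the same vectorised idea can be sketched in base R with findInterval. This is only an illustration: the name si_number_base and the default of 3 significant figures are assumptions of mine, and it covers the same five ranges (u, m, none, k, M) and nothing more.

```r
# Hypothetical base-R variant: vectorised over x, no packages needed.
si_number_base <- function(x, digits = 3) {
    lut <- c(1e-6, 1e-3, 1, 1e3, 1e6)
    pre <- c("u", "m", "", "k", "M")
    ix  <- findInterval(x, lut)
    # Start from the plain text form; values below 1e-6 keep it.
    out <- as.character(x)
    ok  <- which(ix > 0)
    out[ok] <- paste0(signif(x[ok] / lut[ix[ok]], digits), pre[ix[ok]])
    out
}

si_number_base(c(2.5e6, 1500, 42, 0.02, 3e-6))
# "2.5M" "1.5k" "42" "20m" "3u"
```

Extending it to the full SI range is just a matter of lengthening the two tables.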

Tarquinnn answered Nov 15 '22 16:11