
How would you write a wrapper function or class to format numbers as percent, currency, etc. in R?

In a previous question, I asked whether a convenient wrapper exists inside base R to format numbers as percentages.

This elicited three responses:

  1. Probably not.
  2. Such a wrapper would be too narrow to be useful. It is better that useRs learn how to use existing tools, such as sprintf, which can format numbers in a highly flexible way.
  3. Such a wrapper is problematic, anyway, since you lose the ability to perform calculations on the object.

Still, in my view the sprintf function is just a little bit too obfuscated for the R beginner to learn (except if they come from a C background). Perhaps a better solution is to modify format or prettyNum to have options for adding prefixes and suffixes, so you could easily create percents, currencies, degrees, etc.
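
To make the comparison concrete, here is a minimal sketch of the kind of sprintf call under discussion, next to what format() offers today. The values and the "$" prefix are made up for illustration.

```r
x <- c(0.256, 0.5)

# sprintf: "%.1f" asks for one decimal place; "%%" is a literal percent sign
pct <- sprintf("%.1f%%", 100 * x)

# format() handles decimals and thousands separators, but a prefix or
# suffix still has to be pasted on by hand
cur <- paste0("$", format(1234.5, big.mark = ",", nsmall = 2))

print(pct)
print(cur)
```

The format string is compact but has to be decoded symbol by symbol, which is the difficulty for beginners described above.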


Question:

How would you design a function, class, or set of functions to elegantly deal with formatting numbers as percentages, currencies, degrees, etc.?

Andrie asked Aug 22 '11 12:08


3 Answers

I would probably keep things very simple. format() is generally useful for most basic formatting needs. I would extend that with a simple wrapper that allowed arbitrary prefix and suffix strings. Here is a simple version:

formatVal <- function(x, prefix = "", suffix = "", sep = "", collapse = NULL,
                      ...) {
    x <- format(x, ...)
    x <- paste(prefix, x, suffix, sep = sep, collapse = collapse)
    x
}

If I were doing this for real, I would probably drop the collapse argument from the definition of formatVal() and extract it from ... instead; for illustration, I kept the function above simple.
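
A minimal sketch of that variant, assuming collapse is pulled out of ... before the remaining arguments are handed to format(). The name formatVal2 is hypothetical, chosen only to avoid clashing with the function above.

```r
# Hypothetical variant of formatVal(): collapse is no longer a formal
# argument but is picked out of the dots if the caller supplies it.
formatVal2 <- function(x, prefix = "", suffix = "", sep = "", ...) {
    dots <- list(...)
    collapse <- dots[["collapse"]]   # NULL when not supplied
    dots[["collapse"]] <- NULL       # format() must not receive it
    x <- do.call(format, c(list(x), dots))
    paste(prefix, x, suffix, sep = sep, collapse = collapse)
}

joined <- formatVal2(c(1.5, 2.25), suffix = "%", collapse = ", ")
plain  <- formatVal2(c(10, 20), prefix = "$")
print(joined)
print(plain)
```

Everything left in the dots after collapse is removed goes to format() unchanged, so digits, nsmall and similar arguments behave as in the original function.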

Using:

set.seed(1)
m <- runif(5)

Some simple examples of usage:

> formatVal(m*100, suffix = "%")
[1] "26.55087%" "37.21239%" "57.28534%" "90.82078%" "20.16819%"
> formatVal(m*100, suffix = "%", digits = 2)
[1] "27%" "37%" "57%" "91%" "20%"
> formatVal(m*100, suffix = "%", digits = 2, nsmall = 2)
[1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"
> formatVal(m, prefix = "£")
[1] "£0.2655087" "£0.3721239" "£0.5728534" "£0.9082078" "£0.2016819"
> formatVal(m, prefix = "£", digits = 1)
[1] "£0.3" "£0.4" "£0.6" "£0.9" "£0.2"
> formatVal(m, prefix = "£", digits = 1, nsmall = 2)
[1] "£0.27" "£0.37" "£0.57" "£0.91" "£0.20"

Gavin Simpson answered Oct 15 '22 16:10


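# S3 print method: scale the value, round to the stored precision, add prefix/suffix
print.formatted <- function(x, ...)
{
   fmt <- paste("%.", attr(x,"precision"), "f", sep="")
   value <- sprintf(fmt, as.numeric(x) * attr(x,"scaleFactor"))
   print(paste(attr(x,"prefix"), value, attr(x,"suffix"), sep=""))
   invisible(x)
}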

as.percent <- function(x,precision=3)
{
  class(x) <- c(class(x),"formatted")
  attr(x,"scaleFactor")<-100
  attr(x,"prefix")<-""
  attr(x,"suffix")<-"%"
  attr(x,"precision")<-precision
  return(x)
}

as.currency <- function(x,prefix="£")
{
  class(x) <- c(class(x),"formatted")
  attr(x,"scaleFactor")<-1
  attr(x,"prefix")<-prefix
  attr(x,"suffix")<-""
  attr(x,"precision")<-2
  return(x)
}

as.percent(runif(3))
[1] "21.585%" "12.396%" "37.744%"

x <- as.currency(rnorm(3,500,100))
x
[1] "£381.93" "£339.49" "£521.74"
2*x
[1] "£763.86"  "£678.98"  "£1043.48"
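
The last result works because arithmetic on a classed numeric vector keeps its attributes, so 2*x is still a "formatted" object. A possible refinement, sketched here under assumptions rather than taken from the code above, is to build the string in a format() method and let print() call it, so the same text can be reused in paste() and similar functions. The constructor name as_formatted is hypothetical.

```r
# Hypothetical generic constructor; attribute names follow the code above.
as_formatted <- function(x, prefix = "", suffix = "", scale = 1, precision = 2) {
  structure(x, class = "formatted", prefix = prefix, suffix = suffix,
            scaleFactor = scale, precision = precision)
}

# Build the display string; the underlying numbers are left untouched.
format.formatted <- function(x, ...) {
  fmt <- paste0("%.", attr(x, "precision"), "f")
  paste0(attr(x, "prefix"),
         sprintf(fmt, as.numeric(x) * attr(x, "scaleFactor")),
         attr(x, "suffix"))
}

print.formatted <- function(x, ...) {
  print(format(x), quote = FALSE)
  invisible(x)
}

p <- as_formatted(c(0.25, 0.5), suffix = "%", scale = 100, precision = 1)
print(p)
print(p * 2)        # arithmetic keeps the class and attributes
doubled <- format(p * 2)
```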

James answered Oct 15 '22 16:10


I think this could be done through attributes. For example, let v <- 3.4. If it is in pounds sterling, we could use something like:

attributes(v)<-list(style = "descriptor", type = "currency", category = "pound")

If it is a percentage:

attributes(v)<-list(style = "descriptor", type = "proportion", category = "percentage")

Then, a special print method would be necessary. One could also incorporate a translation method, e.g. to convert from GBP to USD (pounds to dollars), centimeters to inches, etc.
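
A rough sketch of what such a print method and a translation step might look like. It assumes the value also carries an S3 class (here called "descriptor") so that print() can dispatch on it; the function names descriptor() and convert() are hypothetical, and the rate used below is a placeholder rather than a real exchange rate.

```r
# Assumption: values get an S3 class "descriptor" in addition to the
# attributes suggested above, so that print() can dispatch on them.
descriptor <- function(x, type, category) {
  structure(x, class = "descriptor", style = "descriptor",
            type = type, category = category)
}

format.descriptor <- function(x, ...) {
  value <- as.numeric(x)  # drop attributes before formatting
  switch(attr(x, "category"),
         pound      = paste0("GBP ", sprintf("%.2f", value)),
         dollar     = paste0("USD ", sprintf("%.2f", value)),
         percentage = paste0(sprintf("%.1f", 100 * value), "%"),
         format(value, ...))  # fallback for unknown categories
}

print.descriptor <- function(x, ...) {
  print(format(x, ...), quote = FALSE)
  invisible(x)
}

# Hypothetical translation step; the caller supplies the rate.
convert <- function(x, to, rate) {
  out <- x * rate              # arithmetic keeps class and attributes
  attr(out, "category") <- to
  out
}

v <- descriptor(3.4, type = "currency", category = "pound")
p <- descriptor(0.125, type = "proportion", category = "percentage")
print(v)
print(p)
print(convert(v, to = "dollar", rate = 2))  # rate = 2 is a placeholder only
```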

I think of the descriptor as a reserved flag that marks a value as needing special handling. This could later extend to text strings, such as addresses and names. For other numbers, such as phone numbers, there may be special decompositions into country code, intra-country area/regional codes, all the way down to extensions.

Such a package might be akin to ggplot for data types: special methods for storing, transforming, and printing values within each type.

Such a system might ensure that dimensions are correct when multiplying values. That has real utility in a lot of applications.
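
As a small illustration of the idea, here is a sketch that covers only the simpler additive case: adding or subtracting values whose units differ raises an error. Checking multiplication properly would need some form of unit algebra, which is not attempted here. The class name with_unit and the "unit" attribute are assumptions made for the example.

```r
# Hypothetical class: a number tagged with a unit string.
with_unit <- function(x, unit) structure(x, class = "with_unit", unit = unit)

# Group method for arithmetic: refuse to add or subtract mismatched units,
# otherwise fall through to the ordinary numeric operation.
Ops.with_unit <- function(e1, e2) {
  u1 <- attr(e1, "unit")
  u2 <- if (missing(e2)) NULL else attr(e2, "unit")
  additive <- .Generic %in% c("+", "-")
  if (additive && !is.null(u1) && !is.null(u2) && !identical(u1, u2)) {
    stop("cannot combine '", u1, "' with '", u2, "'")
  }
  NextMethod()
}

a    <- with_unit(2, "cm")
b    <- with_unit(3, "cm")
inch <- with_unit(1, "inch")

total <- a + b
print(as.numeric(total))
print(attr(total, "unit"))

# Mixing units is caught; the error message is captured for display.
res <- tryCatch(a + inch, error = function(e) conditionMessage(e))
print(res)
```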

To my knowledge, the only widespread handling of units in R is for bytes (bytes, KB, MB, etc.) and time (hours, seconds, etc.). Even so, the handling, while simple, isn't obvious: I still have to tell print which units to use. For instance, if I want to print an object's size in KB, I can't simply calculate object.size(v)/1024, because the output is still labelled as bytes (a fraction of the byte count) rather than KB; I have to use print(object.size(v), units = "Kb").
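
The object.size() behaviour can be checked with a short sketch. The vector is arbitrary, and the example spells the unit out in full as "Kb".

```r
v <- 1:1000
size <- object.size(v)

# Dividing rescales the number but keeps the "object_size" class,
# so print() still labels the result as bytes.
scaled <- size / 1024
print(scaled)

# Asking for the unit explicitly gives the intended label.
print(size, units = "Kb")
label <- format(size, units = "Kb")
```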

Iterator answered Oct 15 '22 15:10