I am handling a DB with hour format like:
    HOUR ID
       1  2
      10  4
       5  6
      20  6
I would like to add a leading zero to the values that have only one character and store the result in a new column named NHOUR, like:
    NHOUR HOUR ID
       01    1  2
       10   10  4
       05    5  6
       20   20  6
So far I have been struggling with something like this (I followed some suggestions already given for ifelse on the forum):
DB$NHOUR<-with(DB,ifelse(nchar(HOUR,type="chars")==1),sprintf("%02d",HOUR),as.numeric(HOUR))
but without any success! R always reports "yes" element is not specified, etc.
As always, any tips are appreciated!
A leading zero is any 0 digit that comes before the first nonzero digit in a number string in positional notation. For example, James Bond's famous identifier, 007, has two leading zeros. When leading zeros occupy the most significant digits of an integer, they could be left blank or omitted for the same numeric value.
In signal processing, zero padding usually means appending zeros to the end of an input sequence, often so that the total number of samples equals the next higher power of two. That is a different operation from padding a number with leading zeros for display, which is what this question is about.
Simply following the advice in @joran's comment,
    DB <- data.frame(
      HOUR = c(1, 10, 5, 20),
      ID   = c(2, 4, 6, 6))
    NHOUR <- sprintf("%02d", DB$HOUR) # fix to 2 characters
    cbind(NHOUR, DB) # combine old and new data
      NHOUR HOUR ID
    1    01    1  2
    2    10   10  4
    3    05    5  6
    4    20   20  6
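For what it's worth, the error in the question seems to come from a misplaced parenthesis: `ifelse(nchar(...) == 1)` is closed before the `yes` and `no` arguments are supplied. Since `sprintf()` is vectorized and `%02d` leaves two-digit values unchanged, `ifelse()` should not be needed at all. A minimal sketch, reusing the `DB` example above and assuming HOUR holds whole numbers (the `NHOUR2` column is only there for comparison):

```r
DB <- data.frame(HOUR = c(1, 10, 5, 20), ID = c(2, 4, 6, 6))

# sprintf() is vectorized; "%02d" pads to width 2 with leading zeros
# and leaves values that already have two digits unchanged.
DB$NHOUR <- sprintf("%02d", DB$HOUR)

# Note that the result is a character vector, not a number.
is.character(DB$NHOUR)

# If you do want ifelse(), the yes/no branches go inside its parentheses,
# and both branches should return character values.
DB$NHOUR2 <- with(DB, ifelse(nchar(HOUR) == 1,
                             sprintf("%02d", HOUR),
                             as.character(HOUR)))
```

Both columns end up with the same padded strings, so the plain `sprintf()` call is probably the simpler choice.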
Update 2013-01-21 23:42:00Z Inspired by daroczig's performance test below, and because I wanted to try out the microbenchmark package, I've updated this question with a small performance test of my own comparing the three different solutions suggested in this thread.
    # install.packages(c("microbenchmark", "stringr"), dependencies = TRUE)
    require(microbenchmark)
    require(stringr)

    SPRINTF <- function(x) sprintf("%02d", x)
    FORMATC <- function(x) formatC(x, width = 2, flag = "0")
    STR_PAD <- function(x) str_pad(x, width = 2, side = "left", pad = "0")

    x <- round(runif(1e5) * 10)
    res <- microbenchmark(SPRINTF(x), STR_PAD(x), FORMATC(x), times = 15)

    ## Print results:
    print(res)
    Unit: milliseconds
            expr       min        lq    median        uq      max
    1 FORMATC(x) 623.53785 629.69005 638.78667 671.22769 679.8790
    2 SPRINTF(x)  34.35783  34.81807  35.04618  35.53696  37.1622
    3 STR_PAD(x) 116.54969 118.41944 118.97363 120.05729 163.9664

    ### Plot results:
    boxplot(res)
I like to use the stringr package:

    library(stringr)
    DB$NHOUR <- str_pad(DB$HOUR, width = 2, side = "left", pad = "0")
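One practical difference worth keeping in mind: `str_pad()` works on strings, whereas `sprintf("%02d", ...)` raises an error if it is given character input. If HOUR was read in as text, a base-R sketch could convert it first. The character version of `DB` below is an assumption made up for illustration, not data from the question:

```r
# Assumption: HOUR arrived as text rather than as numbers.
DB <- data.frame(HOUR = c("1", "10", "5", "20"),
                 ID = c(2, 4, 6, 6),
                 stringsAsFactors = FALSE)

# "%02d" needs whole-number input, so convert the strings to integers first.
DB$NHOUR <- sprintf("%02d", as.integer(DB$HOUR))

DB$NHOUR
```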