
R: Remove leading zeroes from the beginning of a character string

I have first referred to this question, but the answers did not help in my case.

I have a list where each component contains elements that start with a number, followed by words (characters). Some of the numbers at the beginning of the elements have one or more leading zeroes. Here is a small part of the list:

x <- list(el1 = c("0010 First",
                  "0200 Second",
                  "0300 Third",
                  "4000 Fourth",
                  "0 Undefined",
                  "60838 Random",
                  "903200 Haphazard"),
          el2 = c("0100 Hundredth",
                  "0200 Two hundredth",
                  "0300 Three hundredth",
                  "0040 Fortieth",
                  "0 Undefined",
                  "949848 Random",
                  "202626 Haphazard"),
          el3 = c("0010 First",
                  "0200 Second",
                  "0300 Third",
                  "0100 Hundredth",
                  "0200 Two hundredth",
                  "0300 Three hundredth",
                  "0 Undefined",
                  "60838 Random",
                  "20200 Haphazard"))

What I want is to remove the leading zeros wherever they occur, while keeping the single zero at the beginning of 0 Undefined and leaving all elements that have no leading zeros unchanged. That is, the list should look as follows:

x <- list(el1 = c("10 First",
                  "200 Second",
                  "300 Third",
                  "4000 Fourth",
                  "0 Undefined",
                  "60838 Random",
                  "903200 Haphazard"),
          el2 = c("100 Hundredth",
                  "200 Two hundredth",
                  "300 Three hundredth",
                  "40 Fortieth",
                  "0 Undefined",
                  "949848 Random",
                  "202626 Haphazard"),
          el3 = c("10 First",
                  "200 Second",
                  "300 Third",
                  "100 Hundredth",
                  "200 Two hundredth",
                  "300 Three hundredth",
                  "0 Undefined",
                  "60838 Random",
                  "20200 Haphazard"))

I have been going for hours now without success. The best I could do is this:

lapply(x, function(i) {
  ifelse(grep(pattern = "^0+[1-9]", x = i),
         gsub(pattern = "^0+", replacement = "", x = i), i)
})

However, it returns only the elements that had leading zeroes. The elements without them, including 0 Undefined, are dropped.

Can someone help?

panman asked Sep 27 '15 20:09


1 Answer

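We loop over the list with lapply(x, ...) and use sub to strip the leading zeros from each element. The pattern matches one or more zeros at the start of the string (^0+) that are followed by a digit from 1 to 9. The positive lookahead (?=[1-9]) checks for that digit without consuming it, so only the zeros are replaced with ''. Lookaheads are not supported by R's default regex engine, which is why perl=TRUE is needed.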

lapply(x, function(y) sub('^0+(?=[1-9])', '', y, perl=TRUE))
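To see what the lookahead pattern does and does not touch, here is a minimal sketch on a small vector. The vector v and the "000 Zeros" entry are made up for illustration and are not part of the question's data.

```r
# Small illustrative vector (hypothetical, not the full list from the question)
v <- c("0010 First", "0 Undefined", "4000 Fourth", "000 Zeros")

# Strip leading zeros only when a digit 1-9 follows them
stripped <- sub("^0+(?=[1-9])", "", v, perl = TRUE)
print(stripped)
# "10 First" "0 Undefined" "4000 Fourth" "000 Zeros"
```

Note that an all-zero number such as "000 Zeros" is left unchanged, because no digit from 1 to 9 follows the zeros. If such values can occur and should collapse to a single zero, the pattern would need adjusting.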

Or, as @hwnd mentioned in the comments, we can use a capture group instead of the lookahead: the digit after the zeros is captured by ([1-9]) and put back with \\1.

lapply(x, function(y) sub('^0+([1-9])', '\\1', y))

Or, without the anonymous function, we can pass sub directly and specify its pattern and replacement arguments by name. Each list element is then matched to sub's remaining argument, x.

lapply(x, sub, pattern='^0+([1-9])', replacement='\\1')
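As for why the ifelse attempt in the question drops elements: grep returns the positions of the matching elements, not one TRUE/FALSE per element, so ifelse gets a test that is shorter than the input. A sketch of the same idea with grepl, which returns a logical vector of the input's length, might look like this. The vector i is a made-up sample, not the question's data.

```r
# Hypothetical sample vector for illustration
i <- c("0010 First", "4000 Fourth", "0 Undefined", "0200 Second")

# grep() gives the indices of the matches
idx <- grep("^0+[1-9]", i)       # 1 4

# grepl() gives one logical value per element
has_zeros <- grepl("^0+[1-9]", i)  # TRUE FALSE FALSE TRUE

# With a full-length logical test, ifelse() keeps every element
fixed <- ifelse(has_zeros, sub("^0+", "", i), i)
print(fixed)
# "10 First" "4000 Fourth" "0 Undefined" "200 Second"
```

Wrapped in lapply(x, function(i) ...), this should give the same result as the sub calls above, though the single sub call is shorter.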
akrun answered Oct 06 '22 01:10