gsub("(?<![0-9])0+", "", c("005", "0AB", "000", "0"), perl = TRUE)
#> [1] "5"  "AB" ""   ""
gsub("(^|[^0-9])0+", "\\1", c("005", "0AB", "000", "0"), perl = TRUE)
#> [1] "5"  "AB" ""   ""
The regular expressions above are from this SO thread explaining how to remove all leading zeros from a string in R. As a consequence, both "000" and "0" are transformed into "". Instead, I want to remove all leading zeros from a string but always keep its final character, so that a string made up only of zeros is reduced to a single "0":
"005" would become "5"
"0AB" would become "AB"
"000" would become "0"
"0"   would become "0"
This other SO thread explains how to do what I want, but I don't think I have the syntax right when applying the solution in R. I also don't understand the distinction between the 1st and 2nd solutions below (assuming they work at all).
gsub("s/^0*(\d+)$/$1/;", "", c("005", "0AB", "000", "0"), perl = TRUE)  # 1st solution
# Error: '\d' is an unrecognized escape in character string starting ""s/^0*(\d"
gsub("s/0*(\d+)/$1/;", "", c("005", "0AB", "000", "0"), perl = TRUE)    # 2nd solution
# Error: '\d' is an unrecognized escape in character string starting ""s/0*(\d"
What is the proper regex in R to get what I want?
You may remove all zeros from the start of a string but not the last one:
sub("^0+(?!$)", "", x, perl=TRUE)
See the regex demo.
Details
^ - start of a string
0+ - one or more zeros
(?!$) - a negative lookahead that fails the match if there is an end of string position immediately to the right of the current location
See the R demo:
x <- c("005", "0AB", "000", "0")
sub("^0+(?!$)", "", x, perl=TRUE)
## => [1] "5"  "AB" "0"  "0"
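As for the two attempts in the question: s/.../.../ is Perl's substitution operator, not part of the pattern, so in R the pattern and the replacement are passed as separate arguments to sub(). Backslashes must also be doubled inside an R string ("\\d"), which is what the "unrecognized escape" error is about, and $1 becomes "\\1". A rough translation follows as a sketch; "AB007" is an extra input added here only to show where the two patterns differ.

```r
x <- c("005", "0AB", "000", "0", "AB007")  # "AB007" is a hypothetical extra input

# Perl: s/^0*(\d+)$/$1/
# Anchored at both ends, so only strings made up entirely of digits can match.
first <- sub("^0*(\\d+)$", "\\1", x, perl = TRUE)
print(first)
#> [1] "5"     "0AB"   "0"     "0"     "AB007"

# Perl: s/0*(\d+)/$1/
# Unanchored, so it rewrites the first run of digits wherever it occurs.
second <- sub("0*(\\d+)", "\\1", x, perl = TRUE)
print(second)
#> [1] "5"   "0AB" "0"   "0"   "AB7"
```

Neither translation turns "0AB" into "AB", because both require at least one digit to remain after the zeros, so they do not cover the mixed case asked about here.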
Alternatively, use a non-word boundary, \B. See this demo at regex101 or an R demo at tio.run.
sub("^0+\\B", "", x)
This never consumes the final zero: nothing follows it, so the position right after it is a word boundary and \B cannot match there.
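A short check of the \B pattern on the vector from the question, plus one difference from the lookahead version. The input "00.5" is not from the question and is only there to illustrate the point: \B requires a word character (letter, digit or underscore) right after the removed zeros, whereas (?!$) only requires that the string has not ended.

```r
x <- c("005", "0AB", "000", "0")

# Same result as the lookahead version on the inputs from the question.
res_boundary <- sub("^0+\\B", "", x)
print(res_boundary)
#> [1] "5"  "AB" "0"  "0"

# Hypothetical extra input: zeros followed by a non-word character.
y <- "00.5"
res_y_boundary  <- sub("^0+\\B", "", y)                  # keeps one zero before "."
res_y_lookahead <- sub("^0+(?!$)", "", y, perl = TRUE)   # strips both zeros
print(res_y_boundary)
#> [1] "0.5"
print(res_y_lookahead)
#> [1] ".5"
```

Which behaviour is preferable depends on the data; for purely alphanumeric strings the two patterns should give the same result.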