a <- c("this is a number 9999333333 and i got 12344")
How can I mask any number longer than 5 digits so that every digit after the fifth is replaced with "X"?
Expected Output:
"this is a number 99993XXXXX and i got 12344"
Code I tried:
gsub("(.{5}).*", "X", a)
You can use gsub with a PCRE regex:
(?:\G(?!^)|(?<!\d)\d{5})\K\d
See the regex demo. Details:
- (?:\G(?!^)|(?<!\d)\d{5}) - either the end of the previous successful match (\G(?!^)) or (|) a location not preceded by a digit ((?<!\d)) followed by any five digits
- \K - the match reset operator, which discards all text matched so far
- \d - a digit.
See the R demo:
a <- c("this is a number 9999333333 and i got 12344")
gsub("(?:\\G(?!^)|(?<!\\d)\\d{5})\\K\\d", "X", a, perl=TRUE)
## => [1] "this is a number 99993XXXXX and i got 12344"
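For comparison, the attempt in the question, gsub("(.{5}).*", "X", a), returns just "X": . matches any character, not only digits, so the pattern consumes the whole string and replaces it. If the \G construct feels opaque, the same result can be sketched in base R with gregexpr and regmatches<-. This is a different technique from the regex above, and the names masked and run are only illustrative:

```r
# Base-R sketch: locate runs of six or more digits, then rewrite each run.
a <- "this is a number 9999333333 and i got 12344"
masked <- a

# Match data for every run of at least six consecutive digits
m <- gregexpr("[0-9]{6,}", masked)

# Keep the first five digits of each run and pad the rest with "X"
regmatches(masked, m) <- lapply(regmatches(masked, m), function(run) {
  paste0(substr(run, 1, 5), strrep("X", nchar(run) - 5))
})

masked
## [1] "this is a number 99993XXXXX and i got 12344"
```

Because the pattern only matches runs of six or more digits, shorter numbers such as 12344 are never touched.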
gsubfn in the gsubfn package is like gsub except the replacement string can be a function which inputs the capture groups and outputs a replacement to the match. The function can optionally be expressed in a formula notation as we do here.
The regular expression (\d{5}) matches and captures 5 digits and (\d+) matches and captures the remaining digits. The two capture groups are fed into the function and are pasted back together except each character in the second is replaced with X. r"{...}" is the notation for raw string literals introduced in R 4.0, which eliminates having to use double backslashes to denote a backslash within a string literal.
library(gsubfn)
gsubfn(r"{(\d{5})(\d+)}", ~ paste0(x, gsub(".", "X", y)), a)
## [1] "this is a number 99993XXXXX and i got 12344"
If we replace the first argument with the regular expression r"{(\d{2})(\d{4,})}" then it will replace all but the first two digits provided there are at least 6 digits.
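Spelled out, that variant might look like the following. It is only a sketch: it assumes the gsubfn package is installed and R is version 4.0 or later (for the raw string), and the name masked2 is illustrative:

```r
library(gsubfn)

a <- "this is a number 9999333333 and i got 12344"

# (\d{2}) captures the two digits to keep; (\d{4,}) captures the remaining
# digits, so the whole match requires at least six digits in total.
masked2 <- gsubfn(r"{(\d{2})(\d{4,})}", ~ paste0(x, gsub(".", "X", y)), a)

masked2
## [1] "this is a number 99XXXXXXXX and i got 12344"
```

The five-digit number 12344 is left alone because the second group cannot find four digits after the first two.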
An alternative way, not using gsub, to replace numbers longer than 5 digits in a text is to split the string with strsplit, test whether each piece consists only of digits, and combine substr and strrep:
paste(lapply(strsplit(a, " ")[[1]], function(x) {
if(!grepl("\\D", x)) {
paste0(substr(x, 1, 5), strrep("X", pmax(0, nchar(x)-5)))
} else {x}}), collapse = " ")
#[1] "this is a number 99993XXXXX and i got 12344"
To keep only the first 2 digits and replace the rest with X, for numbers longer than 5 digits:
paste(lapply(strsplit(a, " ")[[1]], function(x) {
if(!grepl("\\D", x) && nchar(x) > 5) {
paste0(substr(x, 1, 2), strrep("X", pmax(0, nchar(x)-2)))
} else {x}}), collapse = " ")
#[1] "this is a number 99XXXXXXXX and i got 12344"
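The two snippets above differ only in how many digits are kept and how long a number must be before it is masked, so they can be folded into one helper. The function mask_tokens and its parameters keep and min_len are hypothetical names introduced here, not part of the answer. Like the original, this sketch splits on single spaces, so a number with punctuation attached (for example "9999333333,") would be left unchanged:

```r
# Sketch of a parameterised version of the strsplit approach.
# keep:    how many leading digits stay visible
# min_len: minimum length a digit-only token needs before it is masked
mask_tokens <- function(s, keep = 5, min_len = keep + 1) {
  tokens <- strsplit(s, " ", fixed = TRUE)[[1]]

  # A token qualifies if it contains no non-digit and is long enough
  is_num <- !grepl("\\D", tokens) & nchar(tokens) >= min_len

  tokens[is_num] <- paste0(
    substr(tokens[is_num], 1, keep),
    strrep("X", nchar(tokens[is_num]) - keep)
  )
  paste(tokens, collapse = " ")
}

a <- "this is a number 9999333333 and i got 12344"
mask_tokens(a)
## [1] "this is a number 99993XXXXX and i got 12344"
mask_tokens(a, keep = 2, min_len = 6)
## [1] "this is a number 99XXXXXXXX and i got 12344"
```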