
Replace numbers longer than 5 digits in a text

a <- c("this is a number 9999333333 and i got 12344")

How can I replace the digits beyond the fifth with "X" in any number longer than 5 digits?

Expected Output:

"this is a number 99993XXXXX and i got 12344"

Code I tried:

gsub("(.{5}).*", "X", a)

prog asked Sep 07 '20 12:09



3 Answers

You can use gsub with a PCRE regex:

(?:\G(?!^)|(?<!\d)\d{5})\K\d

See the regex demo. Details:

  • (?:\G(?!^)|(?<!\d)\d{5}) - the end of the previous successful match (\G(?!^)) or (|) a location not preceded with a digit ((?<!\d)) and then any five digits
  • \K - match reset operator discarding all text matched so far
  • \d - a digit.

See the R demo:

a <- c("this is a number 9999333333 and i got 12344")
gsub("(?:\\G(?!^)|(?<!\\d)\\d{5})\\K\\d", "X", a, perl=TRUE)
## => [1] "this is a number 99993XXXXX and i got 12344"
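
If the number of digits to keep needs to vary, the same pattern can be built with sprintf. This is only a minimal sketch and not part of the original answer: the helper name mask_extra_digits and its keep and mask arguments are made up for illustration.

```r
# Hypothetical helper (illustration only): keep the first `keep` digits of
# every run of digits and replace each remaining digit with `mask`.
mask_extra_digits <- function(x, keep = 5L, mask = "X") {
  stopifnot(keep >= 1L)
  # Same PCRE pattern as above, with the digit count filled in by sprintf
  pattern <- sprintf("(?:\\G(?!^)|(?<!\\d)\\d{%d})\\K\\d", as.integer(keep))
  gsub(pattern, mask, x, perl = TRUE)
}

a <- "this is a number 9999333333 and i got 12344"
mask_extra_digits(a)
## [1] "this is a number 99993XXXXX and i got 12344"
mask_extra_digits(a, keep = 2L)
## [1] "this is a number 99XXXXXXXX and i got 12XXX"
```

Note that with keep = 2L the shorter number is masked as well, because every run of more than 2 digits qualifies.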

Wiktor Stribiżew answered Nov 09 '22 22:11


gsubfn in the gsubfn package is like gsub except the replacement string can be a function which inputs the capture groups and outputs a replacement to the match. The function can optionally be expressed in a formula notation as we do here.

The regular expression (\d{5}) matches and captures 5 digits, and (\d+) matches and captures the remaining digits. The two capture groups are passed to the function (as x and y in the formula) and pasted back together, except that each character of the second is replaced with X. r"{...}" is the raw string literal notation introduced in R 4.0, which removes the need to double each backslash within a string literal.

library(gsubfn)

gsubfn(r"{(\d{5})(\d+)}", ~ paste0(x, gsub(".", "X", y)), a)
## [1] "this is a number 99993XXXXX and i got 12344"

If we replace the first argument with the regular expression r"{(\d{2})(\d{4,})}" then it will replace all but the first two digits provided there are at least 6 digits.
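
Where installing gsubfn is not an option, the same match-then-rebuild idea can be sketched in base R. This swaps gsubfn for gregexpr and the regmatches<- replacement function, so it is an alternative under that assumption rather than the method shown above.

```r
a <- "this is a number 9999333333 and i got 12344"

# Locate every run of 6 or more digits
m <- gregexpr("\\d{6,}", a)

# Rebuild each run: first 5 digits kept, one "X" per remaining digit
regmatches(a, m) <- lapply(regmatches(a, m), function(runs) {
  paste0(substr(runs, 1, 5), strrep("X", nchar(runs) - 5))
})

a
## [1] "this is a number 99993XXXXX and i got 12344"
```

Unlike the gsubfn call, this version modifies a in place, so work on a copy if the original string is still needed.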


G. Grothendieck answered Nov 09 '22 22:11


An alternative that does not use gsub is to split the string on spaces with strsplit, test whether each piece consists only of digits, and rebuild the matching pieces with substr and strrep:

paste(lapply(strsplit(a, " ")[[1]], function(x) {
  if(!grepl("\\D", x)) {
    paste0(substr(x, 1, 5), strrep("X", pmax(0, nchar(x)-5)))
  } else {x}}), collapse = " ")
#[1] "this is a number 99993XXXXX and i got 12344"

To keep only the first 2 digits and replace the rest with X, for numbers longer than 5 digits:

paste(lapply(strsplit(a, " ")[[1]], function(x) {
  if(!grepl("\\D", x) && nchar(x) > 5) {
    paste0(substr(x, 1, 2), strrep("X", pmax(0, nchar(x)-2)))
  } else {x}}), collapse = " ")
#[1] "this is a number 99XXXXXXXX and i got 12344"
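
Both variants can be folded into one function that also accepts a vector of strings. This is a rough sketch only: mask_tokens and its keep and longer_than arguments are hypothetical names, not something from the answer.

```r
# Hypothetical helper (illustration only): split each string on spaces and,
# in all-digit tokens with more than `longer_than` characters, keep the first
# `keep` digits and replace the rest with "X".
mask_tokens <- function(x, keep = 5L, longer_than = 5L) {
  vapply(strsplit(x, " ", fixed = TRUE), function(tokens) {
    hit <- !grepl("\\D", tokens) & nchar(tokens) > longer_than
    tokens[hit] <- paste0(substr(tokens[hit], 1, keep),
                          strrep("X", pmax(0, nchar(tokens[hit]) - keep)))
    paste(tokens, collapse = " ")
  }, character(1))
}

a <- "this is a number 9999333333 and i got 12344"
mask_tokens(a)
## [1] "this is a number 99993XXXXX and i got 12344"
mask_tokens(a, keep = 2L)
## [1] "this is a number 99XXXXXXXX and i got 12344"
```

As with the code above, a number that touches punctuation (for example "1234567,") is left unchanged, because the token then contains a non-digit character.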

GKi answered Nov 10 '22 00:11