How to replace exact number of characters in string based on occurrence between delimiters in R

I have text strings like this:

u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."

What I'd like to do is replace all characters occurring between pairs of ~ delimiters (including the delimiters themselves) by, say, X.

This gsub method replaces each substring between a pair of ~ delimiters with a single X:

gsub("~[^~]+~", "X", u)
[1] "she goes X and he's like X and she's etc."

However, what I'd really like to do is replace each and every character between the delimiters (and the delimiters themselves) by X. The desired output is this:

"she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

I've been experimenting with nchar, backreference, and paste as follows but the result is incorrect:

gsub("(~[^~]+~)", paste0("X{", nchar("\\1"),"}"), u)
[1] "she goes X{2} and he's like X{2} and she's etc."

Any help is appreciated.

asked Oct 14 '20 11:10

Chris Ruehlemann

1 Answer

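The paste0("X{", nchar("\\1"),"}") code results in X{2} because R evaluates the replacement argument before gsub runs: "\\1" is an ordinary string of length 2 (a backslash and the digit 1), so nchar returns 2. \1 is only treated as a backreference inside the replacement string that gsub itself processes; it is not interpolated when it is passed to another function such as nchar. The resulting X{2} is then inserted literally, because quantifiers have no meaning in a replacement string.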
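
The evaluation order can be checked directly in the console. The following is a small sketch (the variable name replacement and the short test string are only for illustration) showing that the replacement is a fixed string by the time gsub sees it:

```r
# "\\1" is an ordinary two-character string: a backslash followed by 1
nchar("\\1")
# [1] 2

# so the replacement is fully built before gsub is ever called
replacement <- paste0("X{", nchar("\\1"), "}")
replacement
# [1] "X{2}"

# gsub inserts that text literally; {2} is not a quantifier in a replacement
gsub("~[^~]+~", replacement, "a ~bc~ d")
# [1] "a X{2} d"
```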

You can use the following solution based on stringr:

> library(stringr)
> u <- "she goes ~Wha::?~ and he's like ~↑Yeah believe me!~ and she's etc."
> str_replace_all(u, '~[^~]+~', function(x) str_dup("X", nchar(x)))
[1] "she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

For each match of ~[^~]+~, str_replace_all passes the matched text to the anonymous function, and str_dup repeats X once per character of the match, so the replacement is exactly as long as the text it replaces.
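
If adding a stringr dependency is not desirable, the same idea can probably be expressed in base R with gregexpr and regmatches<-. This is a sketch that is not part of the original answer; the helper name mask_delimited and its pattern and mask arguments are hypothetical, and strrep assumes R 3.3.0 or later:

```r
# Hypothetical helper (not from the original answer): mask every ~...~ span
# with as many mask characters as the span has characters, delimiters included.
mask_delimited <- function(x, pattern = "~[^~]+~", mask = "X") {
  m <- gregexpr(pattern, x)                 # locate all matches in each string
  regmatches(x, m) <- lapply(
    regmatches(x, m),                       # matched substrings, one vector per string
    function(hit) strrep(mask, nchar(hit))  # one mask character per matched character
  )
  x
}

# "\u2191" is the ↑ character from the question, written as an escape
u <- "she goes ~Wha::?~ and he's like ~\u2191Yeah believe me!~ and she's etc."
mask_delimited(u)
# [1] "she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."
```

nchar counts characters rather than bytes by default, so the ↑ contributes a single X here, the same as in the stringr version.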

answered Oct 18 '22 18:10

Wiktor Stribiżew

Tags: regex, r, backreference
