
Wrapping strings, but not substrings in quotes, using R

This question is related to my question about Roxygen.

I want to write a new function that does word wrapping of strings, similar to strwrap or stringr::str_wrap, but with the following twist: Any elements (substrings) in the string that are enclosed in quotes must not be allowed to wrap.

So, for example, using the following sample data

test <- "function(x=123456789, y=\"This is a long string argument\")"
cat(test)
function(x=123456789, y="This is a long string argument")

strwrap(test, width=40)
[1] "function(x=123456789, y=\"This is a long"
[2] "string argument\")"      

I want the desired output of a newWrapFunction(x, width=40, ...) to be:

desired <- c("function(x=123456789, ", "y=\"This is a long string argument\")")
desired
[1] "function(x=123456789, "               
[2] "y=\"This is a long string argument\")"

identical(desired, newWrapFunction(test, width=40))
[1] TRUE

Can you think of a way to do this?


PS. If you can help me solve this, I will propose this code as a patch to roxygen2. I have identified where this patch should be applied and will acknowledge your contribution.

Andrie asked Nov 05 '22 14:11


1 Answer

Here's what I did to stop strwrap from breaking single-quoted sections on spaces. Define a new function strwrapqt (the snippets below go into its body).

A) Pre-process the input: split it on the single quotes, then substitute "~|~" for the spaces in the "even" sections, which are the ones that were inside quotes:

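 ....
 zz <- strsplit(x, "'") # split on single quotes; even-numbered pieces were inside quotes
 for (i in seq_along(zz)) {
     n <- length(zz[[i]])
     if (n < 2) next # no quoted section here; seq(2, 1, by=2) would raise an error
     for (evens in seq(2, n, by = 2)) {
         # protect the spaces inside the quoted piece from the wrapping split
         zz[[i]][evens] <- gsub("[ ]", "~|~", zz[[i]][evens])
     }
 }
 zz <- unlist(zz) # note: strsplit() has dropped the quote characters themselves
 .... insert just before
 z <- lapply(strsplit) ...........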

Then at the end replace all the "~|~" with spaces. It might be necessary to do a lot more thinking about the other sorts of whitespace "events" to get a fully regular treatment.

....
 y <- gsub("~\\|~", " ", y)
....

Edit: Tested @joran's suggestion. Matching single and double quotes would be a difficult task with the methods I am using, but if one is willing to treat any quote character as an equally valid separator, one could just use zz <- strsplit(x, "\'|\"") as the splitting criterion in the code above.
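
For anyone who wants to try the idea without editing the body of strwrap, here is a minimal standalone sketch of the same protect, wrap, restore steps. It is not the strwrapqt code itself: the name wrap_keep_quoted and its placeholder argument are made up for illustration, and it locates double-quoted sections with gregexpr/regmatches instead of strsplit so that the quote characters stay in place. It assumes a single input string, balanced quotes, and that "~|~" does not already occur in the text.

```r
# Hypothetical helper, not part of base R, stringr, or roxygen2.
# Assumes: x is one string, quotes are balanced, placeholder is absent from x.
wrap_keep_quoted <- function(x, width = 40, placeholder = "~|~") {
  stopifnot(is.character(x), length(x) == 1L)

  # Find every double-quoted section, quotes included.
  m <- gregexpr('"[^"]*"', x)
  quoted <- regmatches(x, m)

  # Hide the spaces inside each quoted section so strwrap() cannot break there.
  regmatches(x, m) <- lapply(quoted, function(s) {
    gsub(" ", placeholder, s, fixed = TRUE)
  })

  # Wrap as usual, then turn the placeholders back into spaces.
  wrapped <- strwrap(x, width = width)
  gsub(placeholder, " ", wrapped, fixed = TRUE)
}

test <- "function(x=123456789, y=\"This is a long string argument\")"
wrap_keep_quoted(test, width = 40)
```

Two caveats. strwrap drops the whitespace at a break, so the first element comes back without the trailing space shown in the question's desired vector, and identical(desired, ...) would be FALSE as written. Also, the three-character placeholder makes a quoted section look longer than it is while strwrap measures it, so a section may be pushed to a new line slightly earlier than strictly necessary.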

IRTFM answered Nov 10 '22 19:11