HI I'm using getParseData
to fix something in the code. For example replace =
with <-
txt = "
flag = F
if(flag){
dat = data.frame(x = 1, stringAsFactor = F)
} else {
dat <- 1
}
"
sf = parse(text = txt)
p = getParseData(sf)
p[p$token == 'EQ_ASSIGN', 'text'] = '<-'
Now how to go back from p
- a data.frame showing the parsed code - to R code as a string? Thanks
update: try getParseText
firstly I can't get the example in getParseText
to work:
fn <- function(x) {
x + 1 # A comment, kept as part of the source
}
d <- getParseData(fn)
d
# NULL
Then I try to mimic the code in ?getParseText
in my example:
txt = "
flag = F
f2 = 1 + 1
if(flag){
dat = data.frame(x = 1, stringAsFactor = F)
} else {
dat <- 1
}
"
sf = parse(text = txt)
p = getParseData(sf)
plus <- which(p$token == "'+'")
sum <- p$parent[plus]
p[as.character(sum), ]
cat(getParseText(p, sum))
# 1 + 1
cat(getParseText(p, unique(p$parent)))
# not correct
cat(paste0(unique(getParseText(p, p$id)), collapse=" ")) # incorrect
# flag = F f2 1 + 1 1 + if(flag){
# dat = data.frame(x = 1, stringAsFactor = F)
# } else {
# dat <- 1
# } if ( ) {
# dat = data.frame(x = 1, stringAsFactor = F)
# } { dat data.frame(x = 1, stringAsFactor = F) data.frame x , stringAsFactor } else {
# dat <- 1
# } dat <- 1 <-
You don't need to use getParseData
to replace the =
symbols with <-
in R code. One of the amazing things about R is that you can operate directly on the language so we will do that here.
sf = parse(text = txt)
sf
is an expression object, which effectively is a list of R language objects, one for each top-level statement:
sf[[1]]
## flag = F
sf[[2]]
## if (flag) {
## dat = data.frame(x = 1, stringAsFactor = F)
## } else {
## dat <- 1
## }
The above are language calls. A call is a an unevaluated R statement, which you get from parse
, or with quote
:
my.call <- quote(1 + 1)
my.call
## 1 + 1
class(my.call)
## [1] "call"
The thing about calls is that R lies to you about their underlying structures. R calls are lists (well, pairlists really but the distinction is irrelevant here) that R displays and treats specially. We can show their true nature:
as.list(my.call)
## [[1]]
## `+`
##
## [[2]]
## [1] 1
##
## [[3]]
## [1] 1
Note how the leading element of the call is the "function", or in this case the "operator", which in R is just a function anyway (more precisely the name of the function/operator). This is always the case with calls. The first element is the function, the following elements are arguments. R pretends operators are special and displays them differently, but for underlying call structure and evaluation they are the same.
Look at what we can do:
my.call[[1]] <- as.name('-')
my.call
## 1 - 1
We used as.name
to create a special type of R object called a symbol. These can be used to reference functions in calls. As you can imagine, if we can replace +
with -
, we can also do the same for =
and <-
. But to do it systematically we need to recurse through the language tree. We will write a simple function that does just that:
symb_rep <- function(lang, from, to) {
if(is.call(lang)) {
if(lang[[1]] == from) lang[[1]] <- to
lang[-1] <- lapply(lang[-1], symb_rep, from, to)
}
lang
}
Then we can run it on our original expression, which recall is a list of calls, so we use lapply
to apply to each element:
lang.sub <- lapply(sf, symb_rep, as.name("="), as.name("<-"))
lang.sub
## [[1]]
## flag <- F
##
## [[2]]
## if (flag) {
## dat <- data.frame(x = 1, stringAsFactor = F)
## } else {
## dat <- 1
## }
If you want the character representation back you can use deparse
:
unlist(lapply(lang.sub, deparse))
## [1] "flag <- F"
## [2] "if (flag) {"
## [3] " dat <- data.frame(x = 1, stringAsFactor = F)"
## [4] "} else {"
## [5] " dat <- 1"
## [6] "}"
Pretty cool, no?
One last point, you'll notice that in data.frame(x = 1, ...)
, the =
was not replaced. Why is that? Well, that =
does not actually exist in the call data. It is displayed by R as decoration. In reality, notice how the parameter names are stored:
as.list(quote(data.frame(x=1, y=2)))
## [[1]]
## data.frame
##
## $x
## [1] 1
##
## $y
## [1] 2
There is no =
in sight, because the parameter names are just names of the call object. R just displays equal signs as a visual aid when it prints out the call, and interprets them the same way when it parses the call. This "lie" is why these two expressions are semantically distinct:
data.frame(x <- 5)
## x....5
## 1 5
x
## [1] 5
data.frame(x = 1)
## x
## 1 1
x
## [1] 5
In the former R assigns 5 to x in the global environment, and makes up a name for the data frame parameter. In the second R uses x
as the parameter name but doesn't assign to the global environment. Because R relies on the =
in parameter lists to get parameter names it cannot use it with the normal assignment semantics.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With