I am going through one of my .R files, cleaning it up a little and trying to get more familiar with writing code the R-ight way. As a beginner, one of my favorite starting points is to get rid of the for() loops and try to transform the expression into a functional programming form.
So here is the scenario:
I am assembling a bunch of data.frames into a list for later usage.
dataList <- list (dataA,
dataB,
dataC,
dataD,
dataE
)
Now I would like to take a look at each data.frame's column names and substitute certain character strings. E.g. I would like to substitute each "foo" and "bar" with "baz". At the moment I am getting the job done with a for() loop, which looks a bit awkward.
colnames(dataList[[1]])
[1] "foo" "code" "lp15" "bar" "lh15"
colnames(dataList[[2]])
[1] "a" "code" "lp50" "ls50" "foo"
matchVec <- c("foo", "bar")
for (i in seq(dataList)) {
for (j in seq(matchVec)) {
colnames (dataList[[i]])[grep(pattern=matchVec[j], x=colnames (dataList[[i]]))] <- c("baz")
}
}
Since I am working with a list here, I thought about the lapply function. My attempts at handling the job with lapply all look all right, but only at first sight. If I write
f <- function(i, xList) {
gsub(pattern=c("foo"), replacement=c("baz"), x=colnames(xList[[i]]))
}
lapply(seq(dataList), f, xList=dataList)
the last line prints out almost what I am looking for. However, if I take another look at the actual column names of the data.frames in dataList:
lapply (dataList, colnames)
I see that no changes have been made to the initial character strings.
So how can I rewrite the for() loop and transform it into a functional programming form?
And how do I substitute both strings, "foo" and "bar", in an efficient way, given that the gsub() function takes only a character vector of length one as its pattern argument?
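Your code almost works -- but remember that R creates copies of the objects that you modify (i.e. pass-by-value semantics), and gsub returns a new character vector rather than changing its input. So you need to explicitly assign the new strings to colnames and return the modified data frame, like so: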
dataA <- dataB <- data.frame(matrix(1:20,ncol=5))
names(dataA) <- c("foo","code","lp15","bar","lh15")
names(dataB) <- c("a","code","lp50","ls50","foo")
dataList <- list(dataA, dataB)
f <- function(i, xList) {
colnames(xList[[i]]) <- gsub(pattern=c("foo|bar"), replacement=c("baz"), x=colnames(xList[[i]]))
xList[[i]]
}
dataList <- lapply(seq(dataList), f, xList=dataList)
The new list will have data frames with the replaced names. To replace both foo and bar, just use alternation in the regular expression passed to gsub ("foo|bar").
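If the strings to replace already live in a vector, such as matchVec from the question, the alternation does not have to be typed by hand. A minimal sketch, assuming none of the entries contain regex metacharacters (the names pattern, oldNames and newNames are only for illustration):

```r
matchVec <- c("foo", "bar")

# Collapse the vector into one alternation pattern: "foo|bar"
pattern <- paste(matchVec, collapse = "|")

oldNames <- c("foo", "code", "lp15", "bar", "lh15")

# gsub returns a new vector; oldNames itself is left untouched
newNames <- gsub(pattern = pattern, replacement = "baz", x = oldNames)
# newNames is now c("baz", "code", "lp15", "baz", "lh15")
```

If an entry could contain characters such as "." or "(", it would need escaping first, or a different approach than a plain regex alternation.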
Note, by the way, that you don't have to do this by indexing into your list -- just use a function that operates on the elements of your list directly:
f <- function(df) {
colnames(df) <- gsub(pattern=c("foo|bar"), replacement=c("baz"), x=colnames(df))
df
}
dataList <- lapply(dataList, f)
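One difference from the original for() loop may matter: the loop uses grep to pick matching columns and overwrites the whole name with "baz", whereas gsub only replaces the matching part of the name. The two agree for the names shown in the question, but not for a name like "foo_old". A rough sketch of a whole-name variant, where renameCols and its arguments are hypothetical names and the "foo_old" column is made up for illustration:

```r
# Hypothetical helper: overwrite the entire column name
# whenever it contains any of the given patterns.
renameCols <- function(df, patterns, replacement) {
  hit <- grepl(paste(patterns, collapse = "|"), colnames(df))
  colnames(df)[hit] <- replacement
  df  # return the modified copy
}

dataA <- data.frame(matrix(1:20, ncol = 5))
names(dataA) <- c("foo_old", "code", "lp15", "bar", "lh15")

# Extra arguments after the function are passed on to renameCols
dataList <- lapply(list(dataA), renameCols,
                   patterns = c("foo", "bar"), replacement = "baz")

wholeNames <- colnames(dataList[[1]])
# gsub on the same names keeps the rest of the string ("baz_old")
partNames <- gsub("foo|bar", "baz", names(dataA))
```

Here wholeNames starts with "baz" while partNames starts with "baz_old"; which behaviour is wanted depends on the data.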