I am trying to rename the columns of multiple data.frames.
To give an example, let's say I have a list of data.frames dfA, dfB and dfC. I wrote a function ChangeNames to set the names accordingly and then used lapply as follows:
dfs <- list(dfA, dfB, dfC)
ChangeNames <- function(x) {
names(x) <- c("A", "B", "C" )
}
lapply(dfs, ChangeNames)
However, this doesn't work as expected. It seems that I am not assigning the new names to the data.frames, but only creating the new names. What am I doing wrong here?
Thank you in advance!
If the data frames were not in a list but just in the global environment, you could refer to them using a character vector of their names.
dfs <- c("dfA", "dfB", "dfC")
for(df in dfs) {
df.tmp <- get(df)
names(df.tmp) <- c("A", "B", "C" )
assign(df, df.tmp)
}
EDIT
To simplify the above code you could use
for(df in dfs)
assign(df, setNames(get(df), c("A", "B", "C")))
or use data.table's setnames, which modifies the object by reference and so doesn't require reassigning:
for(df in c("dfA", "dfB", "dfC"))
data.table::setnames(get(df), c("A", "B", "C"))
There are two things here:
1) You should return the value you want from your function. Otherwise, the value of the last evaluated expression is returned. In your case, that's the result of the assignment names(x) <- c("A", "B", "C"), i.e. the vector of new names rather than the data.frame. So you should add return(x), or simply x, as the final line. Your function would then look like:
ChangeNames <- function(x) {
names(x) <- c("A", "B", "C" )
return(x)
}
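To make the difference concrete, here is a small sketch comparing the original function with the fixed one. The data frame df and the names ChangeNamesBroken and ChangeNamesFixed are made up for illustration and are not part of the question.

```r
# Hypothetical example data; any three-column data.frame would do.
df <- data.frame(x = 1:2, y = 3:4, z = 5:6)

# Original version: the last expression is the assignment itself,
# so the function returns the assigned value (the new names).
ChangeNamesBroken <- function(x) {
  names(x) <- c("A", "B", "C")
}

# Fixed version: the modified data.frame is the last expression.
ChangeNamesFixed <- function(x) {
  names(x) <- c("A", "B", "C")
  x
}

broken <- ChangeNamesBroken(df)
fixed  <- ChangeNamesFixed(df)

class(broken)  # a character vector, not a data.frame
class(fixed)   # a data.frame
names(df)      # the original df keeps its old names in both cases
```

Note that neither version changes df itself; the function only ever sees a local copy bound to x.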
2) lapply does not modify your input objects by reference. It works on a copy. So, you'll have to assign the results back. Alternatively, you can use a for-loop instead of lapply:
# option 1
dfs <- lapply(dfs, ChangeNames)
# option 2
for (i in seq_along(dfs)) {
names(dfs[[i]]) <- c("A", "B", "C")
}
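As a minimal sketch of option 1 without a helper function, base R's setNames can be passed to lapply directly. The example data frames and the variable new_names are assumptions made up for illustration.

```r
# Hypothetical example data with three columns each.
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)
dfs <- list(dfA = dfA, dfB = dfB)

new_names <- c("A", "B", "C")

# setNames() returns the renamed object, so the result of lapply()
# must be assigned back to dfs for the change to persist.
dfs <- lapply(dfs, setNames, new_names)

names(dfs$dfA)  # renamed copy stored in the list
names(dfA)      # the standalone dfA is not affected
```

Only the list elements are renamed this way; the standalone objects dfA and dfB in the calling environment keep their original column names.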
Even using the for-loop, you'll still make a copy (because names(.) <- . does). You can verify this by using tracemem.
df <- data.frame(x=1:5, y=6:10, z=11:15)
tracemem(df)
# [1] "<0x7f98ec24a480>"
names(df) <- c("A", "B", "C")
tracemem(df)
# [1] "<0x7f98e7f9e318>"
If you want to modify by reference, you can use the data.table package's setnames function:
df <- data.frame(x=1:5, y=6:10, z=11:15)
require(data.table)
tracemem(df)
# [1] "<0x7f98ec76d7b0>"
setnames(df, c("A", "B", "C"))
tracemem(df)
# [1] "<0x7f98ec76d7b0>"
You see that the memory location df is mapped to hasn't changed. The names have been modified by reference.