
Rename columns in multiple dataframes, R

Tags: dataframe, r

I am trying to rename columns of multiple data.frames.

To give an example, let's say I have a list of data.frames dfA, dfB and dfC. I wrote a function ChangeNames to set the names accordingly and then used lapply as follows:

dfs <- list(dfA, dfB, dfC)
ChangeNames <- function(x) {
    names(x) <- c("A", "B", "C" )  
}
lapply(dfs, ChangeNames)

However, this doesn't work as expected. It seems that I am not assigning the new names to the data.frame, rather only creating the new names. What am I doing wrong here?

Thank you in advance!

user2706593 asked Aug 22 '13 09:08


2 Answers

If the data frames are not in a list but are separate objects in the global environment, you can refer to them through a character vector of their names.

dfs <- c("dfA", "dfB", "dfC")

for(df in dfs) {
  df.tmp <- get(df)
  names(df.tmp) <- c("A", "B", "C" ) 
  assign(df, df.tmp)
}

EDIT

To simplify the above code you could use

for(df in dfs)
  assign(df, setNames(get(df),  c("A", "B", "C")))

Or use data.table, whose setnames renames by reference, so no reassignment is needed. Note that the vector of new names must have one entry per column.

for(df in dfs)
  data.table::setnames(get(df),  c("A", "B", "C"))
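
The get/assign approach above can be tried end to end with a few small data frames. This is a minimal base R sketch; the example data and the variable names df_names and new_names are made up for illustration, since the original post does not show dfA, dfB and dfC.

```r
# Hypothetical example data; the original post does not show dfA, dfB, dfC.
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)
dfC <- data.frame(u = 1:2, v = 3:4, w = 5:6)

df_names  <- c("dfA", "dfB", "dfC")
new_names <- c("A", "B", "C")

for (nm in df_names) {
  # get() fetches the object by name, setNames() returns a renamed copy,
  # and assign() binds that copy back to the same variable name.
  assign(nm, setNames(get(nm), new_names))
}

names(dfA)  # "A" "B" "C"
```

Each variable ends up bound to a renamed copy; the column contents are unchanged.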
JWilliman answered Oct 25 '22 11:10


There are two things here:

  1) Your function must return the value you want. Without an explicit return, R returns the value of the last evaluated expression. In your case that is the assignment names(x) <- c("A", "B", "C"), whose value is the character vector of new names, not the data.frame. So add return(x), or simply x, as the final line. Your function would then look like:

    ChangeNames <- function(x) {
        names(x) <- c("A", "B", "C" )
        return(x)
    }
    
  2) lapply does not modify its input objects by reference; it works on a copy. So you have to assign the result back. Alternatively, use a for loop instead of lapply:

    # option 1
    dfs <- lapply(dfs, ChangeNames)
    
    # option 2
    for (i in seq_along(dfs)) {
        names(dfs[[i]]) <- c("A", "B", "C")
    }
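
The two points above can be checked side by side. This is a small sketch under assumed data: the two example data frames and the name ChangeNamesBroken (a copy of the function from the question) are made up for illustration.

```r
# Hypothetical example data; the original post does not show the data frames.
dfs <- list(
  data.frame(x = 1:2, y = 3:4, z = 5:6),
  data.frame(p = 1:2, q = 3:4, r = 5:6)
)

# Version from the question: the last expression is the assignment, so the
# function returns the character vector of names, not the data frame.
ChangeNamesBroken <- function(x) {
  names(x) <- c("A", "B", "C")
}

# Fixed version: return the modified copy explicitly.
ChangeNames <- function(x) {
  names(x) <- c("A", "B", "C")
  x
}

broken <- lapply(dfs, ChangeNamesBroken)  # list of character vectors
fixed  <- lapply(dfs, ChangeNames)        # list of renamed data frames

class(broken[[1]])  # "character"
class(fixed[[1]])   # "data.frame"
names(dfs[[1]])     # "x" "y" "z": the input stays as it was until reassigned
```

Assigning the result back, as in dfs <- lapply(dfs, ChangeNames), is what makes the rename stick.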
    

Even with the for loop, a copy is still made, because names(.) <- . copies the object. You can verify this with tracemem.

df <- data.frame(x=1:5, y=6:10, z=11:15)
tracemem(df)
# [1] "<0x7f98ec24a480>"
names(df) <- c("A", "B", "C")
tracemem(df)
# [1] "<0x7f98e7f9e318>"

If you want to modify by reference, you can use data.table package's setnames function:

df <- data.frame(x=1:5, y=6:10, z=11:15)
library(data.table)
tracemem(df)
# [1] "<0x7f98ec76d7b0>"
setnames(df, c("A", "B", "C"))
tracemem(df)
# [1] "<0x7f98ec76d7b0>"

You see that the memory location df is mapped to hasn't changed. The names have been modified by reference.
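
Applied to the list from the question, the same idea might look like the sketch below. It assumes the data.table package is installed (the guard skips the rename otherwise), and the example data frames are made up for illustration.

```r
# Hypothetical example data; the original post does not show the data frames.
dfs <- list(
  data.frame(x = 1:2, y = 3:4, z = 5:6),
  data.frame(p = 1:2, q = 3:4, r = 5:6)
)

# Assumes data.table is installed; the guard skips the rename otherwise.
if (requireNamespace("data.table", quietly = TRUE)) {
  # setnames() changes the names in place and returns its input invisibly,
  # so the elements of dfs are renamed without assigning the result back.
  invisible(lapply(dfs, data.table::setnames, c("A", "B", "C")))
}

lapply(dfs, names)
```

Here lapply is used only for its side effect, which is why its return value is wrapped in invisible() rather than assigned.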

Arun answered Oct 25 '22 10:10