
Applying dplyr's rename to all columns while using pipe operator

I'm working with an imported data set that corresponds to the extract below:

set.seed(1)
dta <- data.frame("This is Column One" = runif(n = 10),
                     "Another amazing Column name" = runif(n = 10),
                     "!## This Columns is so special€€€" = runif(n = 10),
                    check.names = FALSE)

I'm cleaning this data with dplyr. As a first step I would like to convert the column names to syntactically valid ones, and as a second step remove the punctuation. What I have tried so far:

dta_clean <- dta %>% 
    rename(make.names(names(dta)))

generates an error:

> dta_clean <- dta %>% 
+     rename(make.names(names(dta)))
Error: All arguments to rename must be named.

Desired result

What I want to achieve can be done in base R:

names(dta) <- gsub("[[:punct:]]","",make.names(names(dta)))

which would return:

> names(dta)
[1] "ThisisColumnOne"          "AnotheramazingColumnname" "XThisColumnsissospecial"

I want to achieve the same effect, but using dplyr and %>%.

asked Dec 04 '15 15:12 by Konrad



1 Answer

I know this is an old question, and I'm sure you found the solution by now, but I stumbled here searching for the same question, and ultimately found a few new ways to do this.

Dplyr

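Using dplyr 0.6.0 and above, there is now a rename_all function. It applies a function to the vector of column names, so the names do not have to be looked up through dta inside the pipe:

  dta %>% 
    rename_all(function(x) gsub("[[:punct:]]", "", make.names(x)))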

Which works, but it's a little messy to me. If you want more flexibility with dplyr, you can also call on:

  • rename_at
  • rename_if
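
As a rough sketch of how these scoped variants select columns, assuming dplyr 1.0.0 or later is installed: the small data frame and the helper name clean_nm below are made up for illustration and are not part of the original question. In dplyr 1.0.0 and later the scoped verbs are superseded by rename_with(), so that form is included as well.

```r
library(dplyr)

# Illustrative data (hypothetical): two numeric columns and one character column
dta <- data.frame("This is Column One" = runif(3),
                  "Another amazing Column name" = runif(3),
                  "Some label!" = c("a", "b", "c"),
                  check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: make names syntactic, then strip punctuation
clean_nm <- function(x) gsub("[[:punct:]]", "", make.names(x))

# rename_at(): rename only the columns picked by a selection helper
only_some <- dta %>% rename_at(vars(starts_with("This")), clean_nm)

# rename_if(): rename only the columns that satisfy a predicate
only_numeric <- dta %>% rename_if(is.numeric, clean_nm)

# rename_with(): the dplyr >= 1.0.0 form; with no selection it renames every column
all_cols <- dta %>% rename_with(clean_nm)

names(only_some)
names(only_numeric)
names(all_cols)
```

Keeping the cleaning logic in one named function means the same rule can be reused whichever verb does the column selection.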

Janitor

This is a pretty nice package (with plenty of additional utility) that can easily clean up column names:

library(janitor)

dta %>% 
  clean_names()

Which will rename and clean all column names to the following:

[1] "this_is_column_one"  "another_amazing_column_name"  "x_this_columns_is_so_special"

Everything becomes snake_case rather than CamelCase, but overall clean_names is very flexible in the column names it handles. If snake_case is a deal breaker, you can use yet another package, snakecase, and pass its to_big_camel_case() function to rename_all, although that is starting to get a little esoteric.
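
If camel case is the goal, a possibly simpler route than wiring snakecase into rename_all is the case argument of clean_names, which janitor passes on to snakecase. This is a sketch under the assumption that a reasonably recent janitor is installed; the two-column data frame is made up for illustration, and the exact names produced may vary between janitor versions.

```r
library(dplyr)
library(janitor)

# Illustrative data (hypothetical), with non-syntactic column names
dta <- data.frame("This is Column One" = runif(3),
                  "Another amazing Column name" = runif(3),
                  check.names = FALSE)

# Ask clean_names() for upper camel case instead of the default snake_case
dta_camel <- dta %>%
  clean_names(case = "upper_camel")

names(dta_camel)
```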

answered Oct 22 '22 12:10 by Dave Gruenewald