
Change letter case of column names

Tags: r

I have a large number of data sets each containing a long list of column names. In some files the column names are all capital letters and in some files only the first letter of the column names is capitalized. I need to append the data sets and thought the easiest way to match column names among data sets would be to convert the all-capital names into names with only the first letter capitalized.

I am hoping to find a general solution, maybe even a one-liner.

Here is my example data set. The desired names are included in the names statements.

my.data2 <-  "
landuse units grade CLAY    LINCOLN  BASINANDRANGE  MCCARTNEY  MAPLE
apple   acres AAA     0         2          3             4         6
apple   acres AA   1000       900         NA            NA       700
pear    acres AA   10.0        20         NA          30.0        40
peach   acres AAA   500       400        350           300       200
"
my.data2 <- read.table(textConnection(my.data2), header=TRUE)

names(my.data2)[names(my.data2)=="CLAY"]            <- "Clay"
names(my.data2)[names(my.data2)=="BASINANDRANGE"]   <- "BasinandRange"
names(my.data2)[names(my.data2)=="LINCOLN"]         <- "Lincoln"
names(my.data2)[names(my.data2)=="MCCARTNEY"]       <- "McCartney"
names(my.data2)[names(my.data2)=="MAPLE"]           <- "Maple"

my.data2

Note that I included the names McCartney and BasinandRange to make things more realistic and more difficult. However, if I can find a one-liner to deal with 95% of the names and use the above names statements to deal with complications like McCartney and BasinandRange that would be great.

I have searched the internet, including the StackOverflow archives, without finding a solution. Sorry if I overlooked one. Thank you for any help.

Mark Miller asked Nov 06 '12 19:11



9 Answers

Here is a one-liner implementing "the easiest way to match column names among data sets" that I can think of:

## Columns 1:3 left unaltered since they are not place names.
names(my.data2)[-1:-3] <- tolower(names(my.data2)[-1:-3])

## View the results
names(my.data2)
# [1] "landuse"       "units"         "grade"         "clay"         
# [5] "lincoln"       "basinandrange" "mccartney"     "maple"   
Josh O'Brien answered Oct 04 '22 16:10


Easy Solution

names(DF) <- toupper(names(DF))
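To tie this back to the goal of appending data sets: once every data set uses the same case, the names line up and rbind() can match the columns. A minimal sketch, assuming two small hypothetical data frames df1 and df2 (not from the question) whose names differ only by case. Note that this upper-cases every name, including ones like landuse that were lower case to begin with.

```r
# Hypothetical example data: same columns, different letter case
df1 <- data.frame(landuse = "apple", CLAY = 0, LINCOLN = 2)
df2 <- data.frame(landuse = "pear", Clay = 10, Lincoln = 20)

# Put both sets of names in one case so they match exactly
names(df1) <- toupper(names(df1))
names(df2) <- toupper(names(df2))

# rbind() on data frames matches columns by name
combined <- rbind(df1, df2)
names(combined)
# [1] "LANDUSE" "CLAY"    "LINCOLN"
```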

Varun Tandra answered Oct 04 '22 17:10


Modern solution

This is now a job for janitor::clean_names(); just choose the case parameter that fits your need.

ikashnitsky answered Oct 04 '22 15:10


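data.table syntax should be efficient here, because setnames() renames columns by reference without copying the data. It is also a one-line statement once the data is a data.table.

library(data.table)
setDT(my.data2)  # convert the data frame to a data.table by reference
# setnames(x, old, new): rename only columns 4 to 8
setnames(my.data2, 4:8, tolower(names(my.data2)[4:8]))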

#   landuse units grade clay lincoln basinandrange mccartney maple
#1:   apple acres   AAA    0       2             3         4     6
#2:   apple acres    AA 1000     900            NA        NA   700
#3:    pear acres    AA   10      20            NA        30    40
#4:   peach acres   AAA  500     400           350       300   200
linkonabe answered Oct 04 '22 17:10


Combining two of the answers here, I've come up with an elegant tidy way:

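This renames all column/variable names by capitalising the first letter of every word and lower-casing the rest. A name without spaces counts as one word, so BASINANDRANGE becomes Basinandrange, and landuse becomes Landuse.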

library(tidyverse)

my.data2 %>%
  rename_with(str_to_title)
Will M answered Oct 04 '22 17:10


A "tidy" solution:

library(dplyr)

my.data2.mod <- my.data2 %>% 
  rename_at(c("CLAY", "LINCOLN", "BASINANDRANGE", "MCCARTNEY",  "MAPLE"),
            .funs = tolower)

names(my.data2.mod) 
# [1] "landuse"       "units"         "grade"         "clay"         
# [5] "lincoln"       "basinandrange" "mccartney"     "maple"   

Also, to answer the original question and leave some cases capitalized, you can use the snakecase package:

library(snakecase)

my.data2.mod = my.data2 %>% 
  rename_at(
    c("CLAY", "LINCOLN", "BASINANDRANGE", "MCCARTNEY",  "MAPLE"),
    .funs = list(
      ~ to_upper_camel_case(., 
                            abbreviations = c("McCartney", "BasinandRange")
                            )
      )
    )

names(my.data2.mod)
# [1] "landuse"       "units"         "grade"         "Clay"         
# [5] "Lincoln"       "BasinandRange" "McCartney"     "Maple" 
Matthias Schmidtblaicher answered Oct 04 '22 16:10


Another option:

colnames(df) <- stringr::str_to_title(colnames(df))
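If you would rather not add a dependency, roughly the same effect can be had in base R with a Perl-style regular expression. This is a sketch rather than a drop-in replacement for str_to_title(): it treats each name as a single word, and the vector old is just a few names copied from the question.

```r
# Example names taken from the question
old <- c("landuse", "CLAY", "BASINANDRANGE", "MCCARTNEY")

# With perl = TRUE, \\U upper-cases what follows (the first character)
# and \\L lower-cases what follows (the remainder of the name)
new <- sub("^(.)(.*)$", "\\U\\1\\L\\2", old, perl = TRUE)
new
# [1] "Landuse"       "Clay"          "Basinandrange" "Mccartney"
```

Names with internal capitals, such as McCartney and BasinandRange, still need to be fixed by hand afterwards.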
AlexB answered Oct 04 '22 16:10


I used Josh O'Brien's answer, but eventually wrote the code below. It creates column names with the first letter in upper case and the other letters in lower case, with a few exceptions handled as in the original post. I used the same data set as in the original post, but read it into R from a file; n.col holds the number of columns in the data file:

# Count the columns by reading only the header line
n.col <- length(scan("c:/users/mark w miller/simple R programs/names_with_capital_letters.txt", 
         what="character", nlines=1))

# First three columns are character, the remaining ones numeric
my.data2 <- read.table(file = "c:/users/mark w miller/simple R programs/names_with_capital_letters.txt", 
            na.strings = "NA", header = TRUE, colClasses = c('character', 'character', 'character', 
            rep('numeric', n.col - 3)))

first.letter  <- substring(names(my.data2)[-1:-3], 1, 1)
other.letters <- tolower(substring(names(my.data2)[-1:-3], 2))
newnames      <- paste(first.letter, other.letters, sep="")

names(my.data2)[-1:-3] <- newnames
names(my.data2)[names(my.data2)=="Basinandrange"]   <- "BasinandRange"
names(my.data2)[names(my.data2)=="Mccartney"]       <- "McCartney"

my.data2

#   landuse units grade Clay Lincoln BasinandRange McCartney Maple
# 1   apple acres   AAA    0       2             3         4     6
# 2   apple acres    AA 1000     900            NA        NA   700
# 3    pear acres    AA   10      20            NA        30    40
# 4   peach acres   AAA  500     400           350       300   200
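With many data sets, the same steps could be wrapped in a small function so the exceptions live in one place. The sketch below is only one way to arrange it: fix_names and the exceptions lookup vector are hypothetical names introduced here, not part of the code above.

```r
# Hypothetical helper: keep the first letter as is, lower-case the rest,
# then apply hand-written exceptions from a named lookup vector.
fix_names <- function(x, exceptions = character(0)) {
  out <- paste0(substring(x, 1, 1), tolower(substring(x, 2)))
  hit <- out %in% names(exceptions)
  out[hit] <- unname(exceptions[out[hit]])
  out
}

# Names on the left are the automatic results, values are the wanted spellings
exceptions <- c(Basinandrange = "BasinandRange", Mccartney = "McCartney")

old <- c("CLAY", "LINCOLN", "BASINANDRANGE", "MCCARTNEY", "MAPLE")
fix_names(old, exceptions)
# [1] "Clay"          "Lincoln"       "BasinandRange" "McCartney"     "Maple"
```

On a data frame it would be applied as names(my.data2)[-1:-3] <- fix_names(names(my.data2)[-1:-3], exceptions).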
Mark Miller answered Oct 04 '22 16:10


This will make every column name upper case. The first argument to rename_with() is the data frame, not names:

dplyr::rename_with(my.data2, toupper)
HA LIM PARK answered Oct 04 '22 17:10