
How to fix spaces in column names of a data.frame (remove spaces, inject dots)?

Tags: dataframe, r

After importing a file, I always try to remove spaces from the column names so that the columns are easier to refer to.

Is there a better way to do this other than calling transform and then removing the extra column that this command creates?

This is what I use now:

    names(ctm2)
    # the transform function does this, but requires some action
    ctm2 <- transform(ctm2, dymmyvar = 1)
    # remove the dummy column
    ctm2$dymmyvar <- NULL
    names(ctm2)

userJT asked May 21 '12 15:05




2 Answers

There is a more elegant and general solution for this:

tidy.name.vector <- make.names(name.vector, unique=TRUE) 

make.names() makes syntactically valid names out of character vectors. A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Invalid characters, such as spaces, are replaced with dots.

Additionally, the argument unique=TRUE lets you avoid possible duplicates in the new column names.
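To make the behaviour described above concrete, here is a minimal sketch in base R. The column names and the small stand-in data frame are made up for illustration; only make.names() itself comes from the answer.

```r
# Hypothetical column names, as they might look right after an import.
raw_names <- c("First Name", "Zip Code", "Zip Code", "2nd Value", "total")

# Spaces become dots, a leading digit gets an "X" prefix,
# and unique = TRUE appends a numeric suffix to repeated names.
tidy_names <- make.names(raw_names, unique = TRUE)
print(tidy_names)
# "First.Name" "Zip.Code" "Zip.Code.1" "X2nd.Value" "total"

# The same call applied to a data frame (a small stand-in for ctm2).
ctm2 <- data.frame(a = 1:2, b = 3:4)
names(ctm2) <- c("First Name", "Zip Code")
names(ctm2) <- make.names(names(ctm2), unique = TRUE)
print(names(ctm2))
# "First.Name" "Zip.Code"
```

Because the names are rewritten in place, no dummy column has to be added and removed afterwards.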

In practice:

    library(readr)

    # urltxt is the path or URL of the tab-delimited file
    d <- read_delim(urltxt, delim = "\t")
    names(d) <- make.names(names(d), unique = TRUE)

Convex answered Oct 07 '22 01:10


There is a very useful package for this called janitor, which makes cleaning up column names very simple. Its clean_names() function removes special characters, replaces spaces with _, and by default converts the names to lower case.

    library(janitor)

    # can be done by simply
    ctm2 <- clean_names(ctm2)

    # or piping through `dplyr`
    ctm2 <- ctm2 %>%
      clean_names()

camnesia answered Oct 07 '22 01:10
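As a rough illustration of what clean_names() produces, here is a small sketch. It assumes the janitor package is installed; the data frame and its column names are invented for the example.

```r
library(janitor)

# A small stand-in for ctm2 with spaces in its column names.
ctm2 <- data.frame(a = 1:2, b = 3:4)
names(ctm2) <- c("First Name", "Zip Code")

# With default settings, clean_names() yields lower-case snake_case names.
ctm2 <- clean_names(ctm2)
print(names(ctm2))
# "first_name" "zip_code"
```

Note that, unlike make.names(), this separates words with underscores instead of dots and changes the case, so it fits best when snake_case names are acceptable.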