My data frame, which I read from a CSV file, has column names like <code>abc.def, ewf.asd.fkl, qqit.vsf.addw.coil</code>. I want to remove the '.' from all the names and convert them to <code>abcdef, ewfasdfkl, qqitvsfaddwcoil</code>. I tried the sub command <code>sub(".","",colnames(dataframe))</code>, but it removed the first letter of each column name instead, giving <code>bc.def, wf.asd.fkl, qit.vsf.addw.coil</code>. Does anyone know another command to do this? I could change the column names one by one, but I have a lot of files with 30 or more columns each. I am doing this so I can use "sqldf" commands, which don't deal well with ".". Thank you for your help.


How to remove '.' from column names in a dataframe?

2 Answers

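UPDATE (dplyr 0.8.0)

As of dplyr 0.8, funs() is soft-deprecated; use formula notation (~) instead.

A dplyr way to do this, using stringr (this example replaces each period with an underscore; pass "" as the replacement to remove the periods instead):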

library(dplyr)
library(stringr)

data <- data.frame(abc.def = 1, ewf.asd.fkl = 2, qqit.vsf.addw.coil = 3)
renamed_data <- data %>%
  rename_all(~str_replace_all(.,"\\.","_")) # note we have to escape the '.' character with \\
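In dplyr 1.0.0 and later, rename_all() is itself superseded by rename_with(). A minimal sketch of the same renaming with the newer verb, assuming a recent dplyr and stringr are installed; the object name cleaned is only for illustration. Wrapping the pattern in fixed() is an alternative to escaping it:

```r
library(dplyr)
library(stringr)

data <- data.frame(abc.def = 1, ewf.asd.fkl = 2, qqit.vsf.addw.coil = 3)

# fixed(".") tells stringr to match a literal period, so no \\ escape is needed
cleaned <- data %>%
  rename_with(~ str_replace_all(.x, fixed("."), ""))

names(cleaned)
```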

Make sure you have installed both packages with install.packages().

Remember that functions like str_replace_all treat their pattern as a regular expression, in which . is a wildcard that matches any character. To match a literal period you have to escape it as \\.
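The same wildcard behaviour explains why the sub() call in the question went wrong. A small base R sketch (the vector x is made up for illustration): an unescaped "." matches any character and sub() replaces only the first match, so the first letter of each name disappears, whereas gsub() with an escaped or fixed pattern removes every literal period.

```r
x <- c("abc.def", "ewf.asd.fkl", "qqit.vsf.addw.coil")

# "." is a regex wildcard and sub() replaces only the first match,
# so this strips the first character of each name
first_char_removed <- sub(".", "", x)

# sub() with an escaped period removes only the first literal "."
first_dot_removed <- sub("\\.", "", x)

# gsub() replaces every match; both forms remove all literal periods
all_dots_removed <- gsub("\\.", "", x)
all_dots_fixed <- gsub(".", "", x, fixed = TRUE)

print(first_char_removed)
print(first_dot_removed)
print(all_dots_removed)
```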

answered Sep 21 '22 17:09

blakiseskream

1) sqldf can deal with names having dots in them if you quote the names:

library(sqldf)
d0 <- read.csv(text = "A.B,C.D\n1,2")
sqldf('select "A.B", "C.D" from d0')

giving:

  A.B C.D
1   1   2

2) When reading the data using read.table or read.csv, use the check.names = FALSE argument so that R does not convert characters that are invalid in names (such as spaces) to dots.

Compare:

Lines <- "A B,C D
1,2
3,4"
read.csv(text = Lines)
##   A.B C.D
## 1   1   2
## 2   3   4
read.csv(text = Lines, check.names = FALSE)
##   A B C D
## 1   1   2
## 2   3   4

However, in this example the resulting names would still have to be quoted in sqldf, since they contain embedded spaces.
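If the names are read in unchanged like this, one option is to sanitise them afterwards. A sketch, assuming it is acceptable to replace every character other than letters, digits, and underscores with an underscore (the pattern is an illustrative choice, not something sqldf requires):

```r
Lines <- "A B,C D
1,2
3,4"

# Keep the header exactly as written in the file
DF <- read.csv(text = Lines, check.names = FALSE)

# Replace anything that is not a letter, digit, or underscore
names(DF) <- gsub("[^A-Za-z0-9_]", "_", names(DF))

names(DF)
```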

3) To simply remove the periods, if DF is a data frame:

names(DF) <- gsub(".", "", names(DF), fixed = TRUE)

Or it might be nicer to convert the periods to underscores, which is reversible as long as the original names contain no underscores:

names(DF) <- gsub(".", "_", names(DF), fixed = TRUE)

This last line could alternatively be written as:

names(DF) <- chartr(".", "_", names(DF))
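Since the question mentions many files, the renaming can be wrapped in a small helper and applied to each file in turn. This is only a sketch: read_clean is a hypothetical helper name, and the temporary CSV stands in for the real files. Removing periods can make two names collide (for example a.b and ab), so the helper stops if that happens.

```r
# Hypothetical helper: read a CSV and strip periods from its column names
read_clean <- function(path) {
  DF <- read.csv(path)
  new_names <- gsub(".", "", names(DF), fixed = TRUE)
  # Guard against collisions such as "a.b" and "ab" both becoming "ab"
  if (anyDuplicated(new_names) > 0) {
    stop("removing periods produced duplicate column names in ", path)
  }
  names(DF) <- new_names
  DF
}

# Demo with a temporary file standing in for real data
tmp <- tempfile(fileext = ".csv")
writeLines(c("abc.def,ewf.asd.fkl", "1,2"), tmp)

files <- c(tmp)  # in practice, e.g. list.files(pattern = "\\.csv$")
cleaned <- lapply(files, read_clean)
names(cleaned[[1]])
```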

answered Sep 18 '22 17:09

G. Grothendieck


Tags:

r

Amit Singh Parihar
