
How to change multiple columns' data type in R?

I'd like to convert all fields ending with _FL from character to numeric. I thought this code would work, but it does not: all these fields end up filled with NAs. What's wrong with it?

library(data.table)
#s = fread('filename.csv',header = TRUE,sep = ";",dec = ".")
s=data.table(ID=(1:10), B=rnorm(10), C_FL=c("I","N"), D_FL=(0:1), E_FL=c("N","I"))
cn=colnames(s)
# Change all fields ending with _FL from "N"/"I" to numeric 0/1
for (i in cn){
  if(substr(i,nchar(i)-2,nchar(i))=='_FL'){
    s[,i] = as.numeric(gsub("I",1,gsub("N",0,s[,i])))
  }
}
lmocsi Avatar asked Oct 12 '25 02:10



1 Answer

Another option is to find the character columns whose names contain "_FL" by using intersect(), and convert these to binary columns based on the condition == "N" (so "N" becomes 1 and "I" becomes 0):

library(data.table)

# Find relevant columns
chr.cols <- names(s)[intersect(which(sapply(s,is.character)), 
                           grep("_FL", names(s)))]
# Convert to numeric
for(col in chr.cols) set(s, j = col, value = as.numeric(s[[col]] == "N"))

# See result
> s
    ID          B C_FL D_FL E_FL
 1:  1  0.6175364    0    0    1
 2:  2 -0.9500318    1    1    0
 3:  3 -0.6341547    0    0    1
 4:  4 -0.8055696    1    1    0
 5:  5 -0.3139938    0    0    1
 6:  6  0.4676558    1    1    0
 7:  7  1.6455591    0    0    1
 8:  8 -0.4544377    1    1    0
 9:  9  0.3512442    0    0    1
10: 10  0.3828367    1    1    0
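
If you want the mapping from the question's comment instead ("N" to 0, "I" to 1), and only columns whose names end in _FL rather than merely contain it, a variant along these lines should work. Treat it as a sketch: the "_FL$" pattern, the == "I" condition, and the explicit rep() calls in the sample data are assumptions added for illustration, not part of the code above. Note that s[[col]] pulls the column out as a plain vector, which is what the comparison needs.

```r
library(data.table)

# Sample data shaped like the question's example (B is random, so it differs per run).
# rep() is used here instead of relying on recycling of the length-2 vectors.
s <- data.table(ID = 1:10, B = rnorm(10),
                C_FL = rep(c("I", "N"), 5),
                D_FL = rep(0:1, 5),
                E_FL = rep(c("N", "I"), 5))

# Names that END in "_FL" ("$" anchors the match at the end of the name)
fl.cols <- grep("_FL$", names(s), value = TRUE)

# Keep only those that are still character, so the integer D_FL is left untouched
chr.cols <- fl.cols[vapply(fl.cols, function(col) is.character(s[[col]]), logical(1))]

# Replace each column by reference: "I" becomes 1, any other non-NA value becomes 0
for (col in chr.cols) set(s, j = col, value = as.numeric(s[[col]] == "I"))
```

Because the condition is a plain equality test, any value other than "I" (not just "N") ends up as 0, and NA stays NA; if the data may contain other codes, it is worth checking unique(s[[col]]) before converting.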
mtoto Avatar answered Oct 14 '25 18:10
