Reclassify select columns in Data Table

Tags:

r

data.table

I wish to change the class of selected variables in a data.table using a vectorized operation. I am new to the data.table syntax and am trying to learn as much as possible. I know the question is basic, but it will help me better understand the data.table way of thinking!

A similar question has been asked before. However, the solutions there seem to apply to reclassing either just one column or all columns. My question is specifically about a select few columns.

### Load package
require(data.table)

### Create pseudo data
data <- data.table(id     = 1:10,
                   height = rnorm(10, mean = 182, sd = 20),
                   weight = rnorm(10, mean = 160, sd = 10),
                   color  = rep(c('blue', 'gold'), times = 5))

### Reclass all columns
data <- data[, lapply(.SD, as.character)]

### Search for columns to be reclassed
index <- grep('(id)|(height)|(weight)', names(data))

### data frame method
df <- data.frame(data)
df[, index] <- lapply(df[, index], as.numeric)

### Failed attempt to reclass columns using the data.table method
data <- data[, lapply(index, as.character), with = F]

Any help would be appreciated. My data are large, so I need to use regular expressions to build the vector of column numbers to reclassify.

Thank you for your time.

Andreas asked Apr 25 '13 21:04



2 Answers

You could avoid the overhead of constructing .SD within j by using set():

for (j in index) set(data, j = j, value = as.character(data[[j]]))
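
As a rough illustration of the same loop applied in the direction the question asks for (character back to numeric), here is a minimal sketch. The table `dt`, its values, and the anchored pattern `^(id|height)$` are made up for the example and are not from the original post; the unanchored pattern in the question would also match any column name that merely contains those strings.

```r
library(data.table)

# Hypothetical table whose columns are all stored as character,
# mimicking the state of `data` after the "Reclass all columns" step
dt <- data.table(id     = as.character(1:4),
                 height = c("180.5", "175.2", "190.1", "168.9"),
                 color  = c("blue", "gold", "blue", "gold"))

# Positions of the columns to convert, found by regular expression
idx <- grep("^(id|height)$", names(dt))

# set() updates dt by reference, one column per iteration,
# so no .SD subset is built and dt is not copied
for (j in idx) set(dt, j = j, value = as.numeric(dt[[j]]))

# Inspect the resulting column classes
sapply(dt, class)
```

Because set() works by reference, there is no need to assign the result back to `dt`.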
mnel answered Sep 30 '22 05:09


I think that @SimonO101 did most of the job.

data[, names(data)[index] := lapply(.SD, as.character), .SDcols = index]

You can just use the := magic
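
A variant of the same idea, sketched here under assumptions: it selects the columns by name (`value = TRUE` in grep) and converts them to numeric, which is what the question appears to want. The table `dt`, its values, and the pattern are hypothetical and only serve the example.

```r
library(data.table)

# Hypothetical table with every column stored as character
dt <- data.table(id     = as.character(1:4),
                 weight = c("160.2", "155.8", "171.0", "149.3"),
                 color  = c("blue", "gold", "blue", "gold"))

# Names (rather than positions) of the columns to convert
cols <- grep("^(id|weight)$", names(dt), value = TRUE)

# (cols) on the left of := is evaluated to get the target column names;
# .SDcols limits .SD to the same columns, so only they are converted
dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]

# Inspect the resulting column classes
sapply(dt, class)
```

Like set(), := assigns by reference, so the untouched `color` column is not copied.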

dickoa answered Sep 30 '22 03:09