
How to convert a dataframe of factor to numeric?

Tags:

r

I have a data frame in which every column is a factor:

V1 V2 V3
 a  b  c
 c  b  a
 c  b  c
 b  b  a

How can I convert all the values in the data frame into a new data frame with numeric values (a to 1, b to 2, c to 3, etc.)?

mamatv asked Jan 01 '16 15:01



2 Answers

I would try:

> mydf[] <- as.numeric(factor(as.matrix(mydf)))
> mydf
  V1 V2 V3
1  1  2  3
2  3  2  1
3  3  2  3
4  2  2  1
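
For a self-contained check, here is a minimal sketch that rebuilds the question's data frame and applies the same one-liner. The `data.frame()` call with `stringsAsFactors = TRUE` is an assumption, since the question only shows the printed table. `as.matrix()` turns the factor columns into one character matrix, `factor()` then derives a single shared set of levels (sorted alphabetically), and assigning into `mydf[]` writes the codes back while keeping the data frame's shape and column names.

```r
# Rebuild the example data (assumed construction; only the printed table is given)
mydf <- data.frame(
  V1 = c("a", "c", "c", "b"),
  V2 = c("b", "b", "b", "b"),
  V3 = c("c", "a", "c", "a"),
  stringsAsFactors = TRUE
)

# as.matrix() yields a character matrix; factor() builds one shared level set
codes <- as.numeric(factor(as.matrix(mydf)))

# Assigning into mydf[] fills the columns in order and keeps the data frame structure
mydf[] <- codes
str(mydf)
```

Because the levels are pooled across all columns, V2 maps b to 2 even though b is the only value in that column. Converting V2 on its own with `as.numeric()` would give 1 instead.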
A5C1D2H2I1M1N2O1R2T1 answered Nov 04 '22 15:11


Converting a factor to numeric returns its underlying integer codes, which follow the order of the levels. So if the factor columns have levels specified as c('b', 'a', 'c', 'd') or c('c', 'b', 'a'), the integer values will be in that order rather than alphabetical order. To avoid that, it is safer to specify the levels explicitly by calling factor again:

df1[] <- lapply(df1, function(x) 
                as.numeric(factor(x, levels=letters[1:3])))
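
The level-order issue can be seen in a small sketch. The vector `x`, the two-column `df1`, and the helper name `all_levels` below are hypothetical and chosen only for illustration. The last part shows one possible way to pool the levels across columns when they are not known in advance, instead of hard-coding `letters[1:3]`.

```r
# Hypothetical column whose levels were declared in a non-alphabetical order
x <- factor(c("a", "c", "c", "b"), levels = c("c", "b", "a"))

# Codes follow the declared level order: c = 1, b = 2, a = 3
raw_codes <- as.numeric(x)
raw_codes    # 3 1 1 2

# Re-declaring the levels pins the mapping to a = 1, b = 2, c = 3
fixed_codes <- as.numeric(factor(x, levels = letters[1:3]))
fixed_codes  # 1 3 3 2

# If the levels are not known up front, pool them across all columns first
df1 <- data.frame(V1 = x, V2 = factor(c("b", "b", "b", "b")))
all_levels <- sort(unique(unlist(lapply(df1, levels))))
df1[] <- lapply(df1, function(col) as.numeric(factor(col, levels = all_levels)))
df1
```

Pooling with `sort(unique(...))` gives every column the same mapping, so b becomes 2 in V2 even though it is that column's only level.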

If we are using data.table, one option is to use set, which assigns by reference and should be more efficient for large datasets. Converting to a matrix may cause memory problems.

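library(data.table)
setDT(df1)  # convert df1 to a data.table by reference
for (j in seq_along(df1)) {
  # overwrite column j in place with its numeric codes
  set(df1, i = NULL, j = j,
      value = as.numeric(factor(df1[[j]], levels = letters[1:3])))
}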
akrun answered Nov 04 '22 14:11