Change the class from factor to numeric of many columns in a data frame

Tags:

r

What is the quickest/best way to change a large number of columns to numeric from factor?

I used the following code but it appears to have re-ordered my data.

> head(stats[,1:2])
  rk                 team
1  1 Washington Capitals*
2  2     San Jose Sharks*
3  3  Chicago Blackhawks*
4  4     Phoenix Coyotes*
5  5   New Jersey Devils*
6  6   Vancouver Canucks*

for(i in c(1,3:ncol(stats))) {
    stats[,i] <- as.numeric(stats[,i])
}

> head(stats[,1:2])
  rk                 team
1  2 Washington Capitals*
2 13     San Jose Sharks*
3 24  Chicago Blackhawks*
4 26     Phoenix Coyotes*
5 27   New Jersey Devils*
6 28   Vancouver Canucks*

What is the best way, short of naming every column as in:

df$colname <- as.numeric(df$colname)
Btibert3 asked Sep 26 '10 01:09


2 Answers

Be careful when converting factors to numeric. The code below converts a set of columns from factor to numeric. I am assuming that the columns to convert are 1, 3, 4 and 5; change cols to suit your data.

cols <- c(1, 3, 4, 5)
df[, cols] <- apply(df[, cols], 2, function(x) as.numeric(as.character(x)))
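As a possible alternative (a sketch, not part of the answer above), lapply() can do the same conversion column by column without first coercing the selection to a matrix, which is what apply() does with a data frame. The toy data frame and its column names here are hypothetical.

```r
# Hypothetical toy data; factor() is called explicitly because
# data.frame() does not create factors by default in R >= 4.0.
df <- data.frame(
  a    = factor(c("3", "10", "7")),
  name = c("x", "y", "z"),
  b    = factor(c("1.5", "2.5", "0.5")),
  c    = factor(c("100", "20", "3")),
  d    = factor(c("9", "8", "7"))
)
cols <- c(1, 3, 4, 5)

# lapply() hands each selected column to the function as a vector;
# assigning into df[cols] keeps the result a data frame.
df[cols] <- lapply(df[cols], function(x) as.numeric(as.character(x)))

str(df)
```

Column 2 is left untouched because it is not listed in cols.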
Ramnath answered Sep 25 '22 06:09


Further to Ramnath's answer, the behaviour you are seeing arises because as.numeric(x) returns the internal integer codes of the factor x, not the values displayed as its levels. To preserve the numbers that are the levels of the factor (rather than their internal representation), convert to character via as.character() first, as in Ramnath's example.
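A minimal illustration of the difference, using a made-up factor rather than the data from the question:

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "2", "33", "2"))

levels(f)                    # labels are sorted as strings, so "10" comes before "2"
as.numeric(f)                # internal integer codes: positions within levels(f)
as.numeric(as.character(f))  # the numbers the labels actually spell out
```

Here the codes are 1, 2, 3, 2 while the intended values are 10, 2, 33, 2, which is the same kind of scrambling seen in the rk column of the question.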

Your for loop is just as reasonable as an apply call and arguably makes the intent of the code slightly clearer. Just change this line:

stats[,i] <- as.numeric(stats[,i]) 

to read

stats[,i] <- as.numeric(as.character(stats[,i])) 
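For reference, here is the corrected loop run end to end on a small stand-in for stats. The three rows and the gp column are invented for illustration; only rk and team appear in the question.

```r
# Hypothetical stand-in for the question's data frame
stats <- data.frame(
  rk   = factor(c("1", "2", "10")),
  team = c("Washington Capitals*", "San Jose Sharks*", "Chicago Blackhawks*"),
  gp   = factor(c("82", "82", "81"))
)

# Convert every column except team (column 2), going through character first
for (i in c(1, 3:ncol(stats))) {
  stats[, i] <- as.numeric(as.character(stats[, i]))
}

str(stats)
```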

This is FAQ 7.10 in the R FAQ.
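That FAQ entry also gives a second form, as.numeric(levels(f))[as.integer(f)], which it describes as more efficient because it converts each level once rather than each element. A small check, on a made-up factor, that the two forms agree:

```r
# Hypothetical factor for illustration
f <- factor(c("10", "2", "33", "2"))

via_character <- as.numeric(as.character(f))        # convert every element
via_levels    <- as.numeric(levels(f))[as.integer(f)]  # convert levels, then index by code

identical(via_character, via_levels)
```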

HTH

Gavin Simpson answered Sep 22 '22 06:09