Convert type of multiple columns of a dataframe at once

I seem to spend a lot of time creating a data frame from a file, database, or some other source, and then converting each column into the type I want (numeric, factor, character, etc.). Is there a way to do this in one step, possibly by giving a vector of types?

foo <- data.frame(x = c(1:10),
                  y = c("red", "red", "red", "blue", "blue",
                        "blue", "yellow", "yellow", "yellow",
                        "green"),
                  z = Sys.Date() + c(1:10))

foo$x <- as.character(foo$x)
foo$y <- as.character(foo$y)
foo$z <- as.numeric(foo$z)

Instead of the last three commands, I'd like to do something like:

foo<-convert.magic(foo, c(character, character, numeric)) 
asked Oct 06 '11 21:10 by PaulHurleyuk


2 Answers

Edit: See this related question for some simplifications and extensions of this basic idea.

My comment to Brandon's answer using switch:

convert.magic <- function(obj, types) {
    # types: character vector with one entry per column of obj
    for (i in seq_along(obj)) {
        FUN <- switch(types[i],
                      character = as.character,
                      numeric = as.numeric,
                      factor = as.factor)
        obj[, i] <- FUN(obj[, i])
    }
    obj
}

out <- convert.magic(foo, c('character', 'character', 'numeric'))
> str(out)
'data.frame':   10 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "4" ...
 $ y: chr  "red" "red" "red" "blue" ...
 $ z: num  15254 15255 15256 15257 15258 ...

For truly large data frames you may want to use lapply instead of the for loop:

convert.magic1 <- function(obj, types) {
    out <- lapply(seq_along(obj), FUN = function(i) {
        FUN1 <- switch(types[i],
                       character = as.character,
                       numeric = as.numeric,
                       factor = as.factor)
        FUN1(obj[, i])
    })
    names(out) <- colnames(obj)
    as.data.frame(out, stringsAsFactors = FALSE)
}
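One possible variation on the same idea, sketched under a few assumptions: the types are given as a named character vector (column name to type) so that only the listed columns are touched, and an unsupported type name raises an error instead of failing inside switch. The helper name convert_by_name and the small example data frame are hypothetical, not part of the original answer.

```r
# Hypothetical helper: 'types' is a named character vector, column name -> type
convert_by_name <- function(obj, types) {
  converters <- list(character = as.character,
                     numeric   = as.numeric,
                     factor    = as.factor)
  # Fail early with a readable message if a type name is not supported
  unknown <- setdiff(types, names(converters))
  if (length(unknown) > 0) {
    stop("unsupported type(s): ", paste(unknown, collapse = ", "))
  }
  # Convert only the columns named in 'types'
  for (col in names(types)) {
    obj[[col]] <- converters[[types[[col]]]](obj[[col]])
  }
  obj
}

# Made-up example data
foo <- data.frame(x = 1:3, y = c("red", "blue", "green"),
                  stringsAsFactors = FALSE)
out <- convert_by_name(foo, c(x = "character", y = "factor"))
str(out)
```

Matching by name rather than by position means the types vector does not have to cover every column or follow the column order.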

When doing this, be aware of some of the intricacies of coercing data in R. For example, converting from factor to numeric usually requires as.numeric(as.character(...)), because as.numeric() applied directly to a factor returns the internal level codes. Also, be aware that data.frame() and as.data.frame() convert character columns to factors by default in versions of R before 4.0.0.
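A minimal sketch of the factor pitfall, using a made-up factor f that is not from the original question: direct coercion gives the level codes, while going through character recovers the values shown in the labels.

```r
# Hypothetical example data: numbers stored as a factor
f <- factor(c("10", "20", "10", "5"))

# Direct coercion returns the integer level codes, not the values
codes <- as.numeric(f)

# Going through character recovers the values shown in the labels
values <- as.numeric(as.character(f))

print(codes)
print(values)
```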

answered Sep 25 '22 08:09 by joran


If you want to detect each column's data type automatically rather than specify it manually (e.g. after data tidying), the function type.convert() may help.

The function type.convert() takes a character vector and attempts to determine the most suitable type for all of its elements, which means it has to be applied once per column.

df[] <- lapply(df, function(x) type.convert(as.character(x))) 
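As a hedged illustration of the one-liner above, here is a small sketch on a made-up data frame whose columns were all read in as character. It assumes you want text columns to stay character, so it passes as.is = TRUE explicitly; recent versions of R warn when as.is is not supplied.

```r
# Hypothetical data frame in which every column was read in as character
df <- data.frame(
  id    = c("1", "2", "3"),
  score = c("1.5", "2.25", "3"),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# as.is = TRUE keeps text columns as character instead of making factors
df[] <- lapply(df, function(x) type.convert(as.character(x), as.is = TRUE))

str(df)
```

Assigning into df[] keeps the object a data frame with its original column names, rather than replacing it with the plain list that lapply returns.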

Since I love dplyr, I prefer:

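library(dplyr)
# Note: funs() is deprecated as of dplyr 0.8.0; with dplyr >= 1.0.0,
# mutate(across(everything(), ~ type.convert(as.character(.x)))) is the equivalent form.
df <- df %>% mutate_all(funs(type.convert(as.character(.))))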
answered Sep 22 '22 08:09 by Luke Hankins