Given a data.frame containing columns of only strings (no factors), some of which should remain strings, some of which are integers, and some of which are doubles, how can I guess the most appropriate storage mode to which to convert the strings?
fixDf <- data.frame(isChar=c("A", "B", "C"),
isDouble=c("0.01", "0.02", "0.03"),
isInteger=c("1", "2", "3"), stringsAsFactors=FALSE)
I am wondering if there is an easy way to determine that the following needs to be done, and then to do it:
mode(fixDf[, "isDouble"]) <- "double"
mode(fixDf[, "isInteger"]) <- "integer"
Ideally, where errors are encountered a function to handle this would leave the data in its string form.
There are several ways to check data types in R. typeof() and class() report the type of a single object, and str() summarizes the type of every column in a data frame.
Everything in R is an object. R has 6 basic (atomic) data types: logical, integer, double, complex, character, and raw.
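As a quick illustration of those checks on the data frame from the question, here is a minimal sketch. The variable name colTypes is not from the original post and is used only for illustration.

```r
# Example data from the question: every column is stored as character
fixDf <- data.frame(isChar = c("A", "B", "C"),
                    isDouble = c("0.01", "0.02", "0.03"),
                    isInteger = c("1", "2", "3"),
                    stringsAsFactors = FALSE)

# typeof() applied to each column reports its storage type
colTypes <- sapply(fixDf, typeof)
print(colTypes)

# class() of the whole object versus a single column
print(class(fixDf))
print(class(fixDf$isDouble))

# str() gives a compact per-column summary of the whole data frame
str(fixDf)
```

Before any conversion, every column reports as character, which is the situation the question starts from.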
You can use colwise from the plyr package together with the type.convert function.
library(plyr)
foo = colwise(type.convert)(fixDf)
str(foo)
'data.frame': 3 obs. of 3 variables:
$ isChar : Factor w/ 3 levels "A","B","C": 1 2 3
$ isDouble : num 0.01 0.02 0.03
$ isInteger: int 1 2 3
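Or using base R, passing as.is = TRUE so that character columns stay as strings instead of becoming factors:
as.data.frame(lapply(fixDf, type.convert, as.is = TRUE), stringsAsFactors = FALSE)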
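The question also asks that data which cannot be converted stay in string form. A minimal sketch of how the base R approach behaves in that case follows. The helper name guessTypes and the extra column mixed are hypothetical, added only for illustration. type.convert does not raise an error on non-numeric input; with as.is = TRUE it simply returns the character vector unchanged.

```r
# Example data from the question, plus a hypothetical column that mixes
# numeric-looking and non-numeric strings
fixDf <- data.frame(isChar = c("A", "B", "C"),
                    isDouble = c("0.01", "0.02", "0.03"),
                    isInteger = c("1", "2", "3"),
                    mixed = c("1", "x", "3"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: guess a type for every column, keeping strings as strings
guessTypes <- function(df) {
  # df[] <- ... replaces the columns in place and keeps the data.frame structure
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

fixed <- guessTypes(fixDf)
print(sapply(fixed, typeof))
```

Here isDouble becomes double, isInteger becomes integer, and both isChar and mixed stay character, so no separate error handling is needed for columns that are not purely numeric.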
type_convert from readr does exactly what you want, operating on an entire data frame. It handles logical, numeric (integer and double), strings, and dates/times well, without coercing to factor.
library(readr)
type_convert(fixDf)
To parse columns individually, use parse_guess.
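A minimal sketch of the per-column route, assuming the readr package is installed. The variable name guessed is only for illustration. Whether whole-number strings come back as integer or double depends on the readr version and its guessing options, so the sketch does not assume either.

```r
library(readr)

# Example data from the question: every column is stored as character
fixDf <- data.frame(isChar = c("A", "B", "C"),
                    isDouble = c("0.01", "0.02", "0.03"),
                    isInteger = c("1", "2", "3"),
                    stringsAsFactors = FALSE)

# Guess and parse each column on its own; the result is a list of vectors
guessed <- lapply(fixDf, parse_guess)

# Inspect the type readr chose for each column
str(guessed)
```

Columns that do not look numeric, such as isChar, are returned as character vectors.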