
Converting from a character to a numeric data frame

Tags: dataframe, r

I have a character data frame in R which has NaNs in it. I need to remove any row with a NaN and then convert it to a numeric data frame.

If I just call as.numeric on the data frame, I get the following error:

Error: (list) object cannot be coerced to type 'double'
ganesh reddy asked Feb 05 '13 21:02



2 Answers

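As @thijs van den bergh points out, one way to do this is to convert each column with sapply and then drop the incomplete rows:

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)

# convert each column to numeric; sapply returns a matrix here, so wrap it in as.data.frame
dat <- as.data.frame(sapply(dat, as.numeric))

# keep only the rows with no NA/NaN values
dat[complete.cases(dat), ]
#  x y
#2 2 3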

Your error comes from calling as.numeric on the whole data.frame, which is a list rather than a vector. The sapply call above instead converts each column vector to numeric.
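To see the difference concretely, the following sketch reproduces the failure on the example data and then converts column by column. The tryCatch wrapper and the names whole and num are only for illustration. The dat[] <- lapply(...) form is one common way to overwrite the columns while keeping the data.frame structure; it is an alternative, not what the answer above uses.

```r
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# as.numeric() on the whole data.frame fails, because a data.frame is a list;
# tryCatch captures the error message instead of stopping the script
whole <- tryCatch(as.numeric(dat), error = function(e) conditionMessage(e))
print(whole)

# converting each column separately works; assigning into num[] keeps
# the object a data.frame instead of turning it into a plain list
num <- dat
num[] <- lapply(num, as.numeric)
str(num)
```

The exact wording of the error message may differ between R versions, so the sketch only captures it rather than matching on it.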

user1317221_G answered Oct 18 '22 22:10



Note that data.frames are not themselves numeric or character. A data.frame is a list whose columns can be all numeric, all character, or a mix of these and other types (e.g. Date or logical).

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)
is.list(dat)
# [1] TRUE

The example data just has two character columns:

> str(dat)
'data.frame':   2 obs. of  2 variables:
 $ x: chr  "NaN" "2"
 $ y: chr  "NaN" "3"

...which you could add a numeric column to like so:

> dat$num.example <- c(6.2,3.8)
> dat
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
> str(dat)
'data.frame':   2 obs. of  3 variables:
 $ x          : chr  "NaN" "2"
 $ y          : chr  "NaN" "3"
 $ num.example: num  6.2 3.8

So when you call as.numeric on the whole data.frame, R fails because it has no single rule for coercing a list, which may hold columns of several types, into one numeric vector. user1317221_G's answer uses sapply (see ?sapply), which applies a function to each element of an object. You could alternatively use lapply (see ?lapply), a very similar function that always returns a list. For more on the *apply functions, see "R Grouping functions: sapply vs. lapply vs. apply vs. tapply vs. by vs. aggregate".
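As a rough illustration of how the two functions differ on this data, the sketch below compares what each one returns. The names s, l and one_row are made up for the example. The one-row case is included as a caveat: sapply simplifies its result, so the shape depends on the input.

```r
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

s <- sapply(dat, as.numeric)  # simplifies the result, here to a numeric matrix
l <- lapply(dat, as.numeric)  # always returns a list, one element per column

is.matrix(s)
is.list(l)

# with a single row, sapply simplifies to a named vector rather than a matrix,
# so as.data.frame() on it would no longer give one column per variable
one_row <- sapply(dat[2, ], as.numeric)
is.matrix(one_row)
```

Because lapply never simplifies, wrapping its result in data.frame behaves the same way regardless of how many rows there are.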

That is, in this case you can apply the as.numeric function to each column of your data.frame, like so:

data.frame(lapply(dat,as.numeric))

The lapply call is wrapped in a data.frame to make sure the output is a data.frame and not a list. That is, running:

lapply(dat,as.numeric)

will give you:

> lapply(dat,as.numeric)
$x
[1] NaN   2

$y
[1] NaN   3

$num.example
[1] 6.2 3.8

While:

data.frame(lapply(dat,as.numeric))

will give you:

>  data.frame(lapply(dat,as.numeric))
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
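The question also asks for rows containing NaN to be dropped. A minimal sketch that combines the lapply conversion with complete.cases from the other answer, using the two-column example data (the names num and clean are only for illustration):

```r
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# convert every column to numeric, keeping the result a data.frame
num <- data.frame(lapply(dat, as.numeric))

# complete.cases() treats NaN as missing, so this drops any row containing NA or NaN
clean <- num[complete.cases(num), ]
clean
```

Converting first and filtering second means the NaN strings have already become real NaN values by the time complete.cases looks at them.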
thelatemail answered Oct 18 '22 22:10
