ggpairs in the GGally package seems pretty useful, but it appears to fail when an NA is present anywhere in the data set:
library(GGally)
data(tips, package = "reshape")
pm <- ggpairs(tips[, 1:3])  # works just fine
# introduce an NA
tips[1, 1] <- NA
ggpairs(tips[, 1:3])
#> Error in if (lims[1] > lims[2]) { : missing value where TRUE/FALSE needed
I don't see any documentation for dealing with NA values, and solutions like ggpairs(tips[,1:3], na.rm=TRUE) (unsurprisingly) don't change the error message.
I have a data set in which perhaps 10% of values are NA, randomly scattered throughout the dataset. Therefore na.omit(myDataSet) will remove much of the data. Is there any way around this?
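To put a rough number on how much na.omit() discards, here is a minimal base R sketch. The data are simulated, and the sizes n and p and the seed are arbitrary assumptions rather than anything taken from the real data set. If about 10% of cells are missing at random across p columns, roughly 0.9^p of the rows should survive.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 1000                         # assumed number of rows
p <- 5                            # assumed number of columns

m <- matrix(rnorm(n * p), nrow = n)
na_cells <- sample(length(m), size = length(m) %/% 10)
m[na_cells] <- NA                 # blank out 10% of the cells at random
df <- as.data.frame(m)

kept <- nrow(na.omit(df)) / nrow(df)   # share of rows with no NA at all
approx_expected <- 0.9^p               # rough expectation for random missingness

print(c(kept = kept, approx_expected = approx_expected))
```

The more columns the data set has, the smaller the surviving share becomes, which is why row-wise deletion is so costly here.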
Some GGally functions, such as ggparcoord(), can handle NAs through the missing parameter, which accepts exclude, mean, median, min10 or random. Unfortunately, this is not the case for ggpairs().
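For comparison, a call to ggparcoord() with the missing argument might look like the sketch below. The toy data frame and its values are made up for illustration, and the call is guarded so the snippet still runs when GGally is not installed.

```r
# Made-up numeric data with a few scattered NAs (illustrative values only)
toy <- data.frame(
  a = c(1.0, 2.0, NA, 4.0, 5.0, 6.0),
  b = c(2.5, NA, 6.1, 8.0, 9.4, 11.0),
  c = c(3.2, 5.1, 7.0, NA, 8.8, 10.5)
)

p <- NULL
if (requireNamespace("GGally", quietly = TRUE)) {
  # missing = "median" asks ggparcoord() to fill NAs with column medians;
  # the other documented choices are "exclude", "mean", "min10" and "random"
  p <- GGally::ggparcoord(toy, columns = 1:3, missing = "median")
}
```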
What you can do is replace the NAs yourself with a reasonable estimate, which is what you were expecting ggpairs() to do for you automatically. There are good solutions, like replacing them with row means, zeros, the median, or even the closest point (notice the 4 hyperlinks on the words of the previous sentence!).
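As one possible way to do this, the sketch below fills NAs column by column before plotting. The helper impute_simple() is a hypothetical name, not part of GGally, and the toy data frame uses made-up values. Numeric columns get the column median, while factor and character columns get their most frequent value.

```r
# Hypothetical helper (not part of GGally): fill NAs column by column.
impute_simple <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x)) next
    if (is.numeric(x)) {
      # numeric column: use the median of the observed values
      x[is.na(x)] <- median(x, na.rm = TRUE)
    } else if (is.factor(x) || is.character(x)) {
      # categorical column: use the most frequent observed value
      counts <- table(x)          # table() drops NA by default
      x[is.na(x)] <- names(counts)[which.max(counts)]
    }
    df[[col]] <- x
  }
  df
}

# Made-up data with scattered NAs (illustrative values only)
toy <- data.frame(
  total_bill = c(16.99, NA, 21.01, 23.68, 24.59),
  tip        = c(1.01, 1.66, NA, 3.31, 3.61),
  sex        = factor(c("Female", "Male", "Male", NA, "Male"))
)

filled <- impute_simple(toy)

# Plot only if GGally is available; ggpairs() now sees no NA values
if (requireNamespace("GGally", quietly = TRUE)) {
  pm <- GGally::ggpairs(filled)
}
```

Keep in mind that filling in a single value per column reduces the apparent spread of the data and can distort the relationships shown in the pairs plot, so it is worth checking how sensitive the picture is to the chosen replacement.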