Are there any examples of datasets in base R that contain missing values? I've been looking through each one in turn and have also searched Google, with nothing so far.
library(MASS)
data()
Edit: I know how to add missing values to a dataset in R; I just want to know whether any such datasets already exist.
In R, the easiest way to find columns that contain missing values is to combine is.na() and colSums(). First, count the NAs in each column with colSums(is.na(df)). Then use names() or colnames() to return the names of the columns that have at least one missing value.
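As a minimal sketch of the two steps above, applied to the built-in airquality data frame (the variable names na_counts and cols_with_na are just illustrative):

```r
# Count the NAs in each column of the built-in airquality data frame
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Keep only the names of columns with at least one NA
cols_with_na <- names(na_counts)[na_counts > 0]
print(cols_with_na)
```

For airquality this should pick out Ozone and Solar.R, which matches the NA's row that summary() reports for those two columns.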
To remove observations that have a missing value in at least one column, you can use the na.omit() function. na.omit() inspects every column of a data frame and drops the rows that have an NA in one or more of them.
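A short sketch of na.omit() on the built-in airquality data frame (the name complete_rows is hypothetical, chosen for this example):

```r
# Drop every row of airquality that has an NA in any column
complete_rows <- na.omit(airquality)

# Compare row counts before and after
nrow(airquality)
nrow(complete_rows)

# Confirm no missing values remain
anyNA(complete_rows)
```

The result has fewer rows than the original, since any row missing either Ozone or Solar.R is removed.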
In R, missing values are represented by the symbol NA (not available). Impossible values (domain errors such as 0/0 or the log of a negative number) are represented by the symbol NaN (Not a Number). NA is used for both numeric and string data.
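The difference between NA and NaN can be checked with a small, made-up vector (x and s are illustrative names only). Note that is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN:

```r
# log(-1) emits a "NaNs produced" warning, so it is suppressed here
x <- suppressWarnings(c(1, NA, 0/0, log(-1)))

na_flags  <- is.na(x)   # flags both NA and NaN
nan_flags <- is.nan(x)  # flags NaN only
print(na_flags)
print(nan_flags)

# NA also works for character data
s <- c("a", NA)
chr_flags <- is.na(s)
print(chr_flags)
```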
airquality is in base R (it ships with the datasets package) and has some NAs in it:
> summary(airquality)
     Ozone           Solar.R           Wind             Temp           Month            Day
 Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00   Min.   :5.000   Min.   : 1.0
 1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00   1st Qu.:6.000   1st Qu.: 8.0
 Median : 31.50   Median :205.0   Median : 9.700   Median :79.00   Median :7.000   Median :16.0
 Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88   Mean   :6.993   Mean   :15.8
 3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00   3rd Qu.:8.000   3rd Qu.:23.0
 Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00   Max.   :9.000   Max.   :31.0
 NA's   :37       NA's   :7
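Rather than checking each dataset by hand, one possible approach is to scan the whole datasets package for objects that contain an NA. This is only a sketch: it assumes the "Item" column returned by data() sometimes has the form "name (file)", which is stripped here, and the variable names are made up for the example.

```r
# List everything shipped in the datasets package
items <- data(package = "datasets")$results[, "Item"]

# Some entries look like "beaver1 (beavers)"; keep only the object name
dataset_names <- unique(sub(" .*$", "", items))

# Check each object for at least one missing value
has_na <- vapply(dataset_names, function(nm) {
  obj <- getExportedValue("datasets", nm)
  anyNA(obj)
}, logical(1))

datasets_with_na <- dataset_names[has_na]
print(datasets_with_na)
```

The same loop could be pointed at another installed package, such as MASS, by changing the package name in both calls.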