I have a dataframe for which I've calculated and added a difftime
column:
name     amount  1st_date    2nd_date    days_out
JEAN      318.5  1971-02-16  1972-11-27   650 days
GREGORY  1518.5  <NA>        <NA>          NA days
JOHN      318.5  <NA>        <NA>          NA days
EDWARD    318.5  <NA>        <NA>          NA days
WALTER    518.5  1971-07-06  1975-03-14  1347 days
BARRY    1518.5  1971-11-09  1972-02-09    92 days
LARRY     518.5  1971-09-08  1972-02-09   154 days
HARRY     318.5  1971-09-16  1972-02-09   146 days
GARRY    1018.5  1971-10-26  1972-02-09   106 days
I want to break it out and take subtotals where days_out is 0-60, 61-90, 91-120, 121-180.
For some reason I can't even reliably write bracket notation. I would expect
members[members$days_out<=120, ] to show just Barry and Garry, but I get a whole lot of lines like:
NA.1095 <NA> NA <NA> <NA> NA days
NA.1096 <NA> NA <NA> <NA> NA days
NA.1097 <NA> NA <NA> <NA> NA days
Those don't exist in the original data. There's no one without a name. What am I doing wrong here?
To remove all rows containing NA, we can use the na.omit() function. For example, if a data frame called df contains some NA values, na.omit(df) removes every row that contains at least one NA.
A missing value is one whose value is unknown. Missing values are represented in R by the NA symbol.
The na.omit() function removes all incomplete cases from a data object (typically a data frame, matrix, or vector).
First, we use brackets with the complete.cases() function to exclude missing values in R. Second, we omit missing values with the na.omit() function.
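A minimal sketch of those two approaches, assuming a small hypothetical data frame `df` (its columns and values are invented for illustration, not taken from the question):

```r
# Hypothetical data frame with one NA in each of two columns
df <- data.frame(name   = c("A", "B", "C"),
                 days   = c(10, NA, 30),
                 amount = c(1.5, 2.5, NA))

# Bracket indexing with complete.cases(): keep rows with no NA in any column
kept_cc <- df[complete.cases(df), ]

# na.omit(): keeps the same rows as the line above
kept_no <- na.omit(df)

# If NAs matter in only one column, test that column directly
kept_days <- df[!is.na(df$days), ]
```

Both `complete.cases()` and `na.omit()` look at every column, so a row is dropped if any field is missing. Testing a single column with `is.na()` is the gentler option when other columns are allowed to be incomplete.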
This is standard behavior for < and other relational operators: when asked to evaluate whether NA is less than (or greater than, or equal to, or ...) some other number, they return NA, rather than TRUE or FALSE.
Here's an example that should make clear what is going on and point to a simple fix.
x <- c(1, 2, NA, 4, 5)
x[x < 3]
# [1] 1 2 NA
x[x < 3 & !is.na(x)]
# [1] 1 2
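Applied to the data frame from the question, the same idea might look like the sketch below. The `members` object here is a hypothetical reconstruction of a few rows, and the column names `first_date` and `second_date` are stand-ins for `1st_date` and `2nd_date`, which are not syntactic R names.

```r
# Hypothetical reconstruction of part of `members`
members <- data.frame(
  name        = c("JEAN", "GREGORY", "BARRY", "LARRY", "GARRY"),
  amount      = c(318.5, 1518.5, 1518.5, 518.5, 1018.5),
  first_date  = as.Date(c("1971-02-16", NA, "1971-11-09", "1971-09-08", "1971-10-26")),
  second_date = as.Date(c("1972-11-27", NA, "1972-02-09", "1972-02-09", "1972-02-09"))
)
# Subtracting two Date columns gives a difftime measured in days
members$days_out <- members$second_date - members$first_date

# Option 1: make the NA handling explicit in the condition
within_120 <- members[!is.na(members$days_out) & members$days_out <= 120, ]

# Option 2: which() returns only the positions that are TRUE, skipping NA
within_120_w <- members[which(members$days_out <= 120), ]

# Option 3: subset() treats an NA condition as FALSE
within_120_s <- subset(members, days_out <= 120)
```

All three options should return only the BARRY and GARRY rows. Which one to use is mostly a matter of taste; `which()` and `subset()` are shorter, while the explicit `!is.na()` form makes the intent visible in the condition itself.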
To see why all of those rows indexed by NA's have row.names like NA.1095, NA.1096, and so on, try this:
data.frame(a=1:2, b=1:2)[rep(NA, 5),]
#       a  b
# NA   NA NA
# NA.1 NA NA
# NA.2 NA NA
# NA.3 NA NA
# NA.4 NA NA
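For the subtotals by range mentioned in the question (0-60, 61-90, 91-120, 121-180), one possible approach is to bin `days_out` with `cut()` and sum `amount` per bin with `aggregate()`. This is only a sketch: the `members` data frame below is a hypothetical stand-in with `days_out` already stored as plain numbers, and the `bucket` column name is invented for illustration.

```r
# Hypothetical stand-in for `members`; days_out holds numbers of days
members <- data.frame(
  amount   = c(318.5, 1518.5, 1518.5, 518.5, 318.5, 1018.5),
  days_out = c(650, NA, 92, 154, 146, 106)
)

# Intervals are [0,60], (60,90], (90,120], (120,180];
# NA and values above 180 get an NA bucket
members$bucket <- cut(as.numeric(members$days_out),
                      breaks = c(0, 60, 90, 120, 180),
                      labels = c("0-60", "61-90", "91-120", "121-180"),
                      include.lowest = TRUE)

# With the formula interface, rows whose bucket is NA are dropped by default
subtotals <- aggregate(amount ~ bucket, data = members, FUN = sum)
```

With a real difftime column, `as.numeric(members$days_out, units = "days")` makes the unit explicit before binning. Buckets with no rows do not appear in `subtotals`, and anything outside 0-180 is left out, so a further break would be needed to cover longer durations.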