
assign to is.na(clinical.trial$age)

I am looking at the code from here which has this at the beginning:

## generate data for medical example 
clinical.trial <-
    data.frame(patient = 1:100,
               age = rnorm(100, mean = 60, sd = 6),
               treatment = gl(2, 50,
                 labels = c("Treatment", "Control")),
               center = sample(paste("Center", LETTERS[1:5]), 100,
                               replace = TRUE))

## set some ages to NA (missing) 
is.na(clinical.trial$age) <- sample(1:100, 20)

I cannot understand this last line. The LHS is a vector of all FALSE values. The RHS is a vector of 20 numbers sampled from 1:100. I don't understand this kind of assignment. How does it result in clinical.trial$age getting some NA values? Does this kind of assignment have a name? At best I would say that the boolean vector on the LHS gets numbers assigned to it with recycling.

asked Jun 14 '17 12:06 by matt


1 Answer

is.na(x) <- value is translated as x <- 'is.na<-'(x, value).

You can think of 'is.na<-'(x, value) as 'assign NA to x, at position value'.

A perhaps more intuitive phrasing would be a hypothetical assign_NA(to = x, pos = value).
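
As a small illustration of the points above, the following sketch applies both forms to a short vector and checks that they give the same result. The vector ages and the helper assign_NA are made up for this example; assign_NA is not a base R function, just one way the intuitive phrasing could be written.

```r
# A small made-up vector to experiment with
ages <- c(61, 58, 70, 65, 59)

# Form 1: replacement syntax, marks positions 2 and 4 as missing
a <- ages
is.na(a) <- c(2, 4)

# Form 2: the explicit call to the replacement function
b <- ages
b <- `is.na<-`(b, value = c(2, 4))

# Hypothetical helper with the more intuitive name (not part of base R)
assign_NA <- function(to, pos) {
  to[pos] <- NA
  to
}
d <- assign_NA(to = ages, pos = c(2, 4))

print(a)          # elements 2 and 4 are now NA
identical(a, b)   # both forms agree
identical(a, d)   # the helper agrees too
```

So in the question, sample(1:100, 20) is simply a vector of 20 positions, and those positions of clinical.trial$age are set to NA. No recycling of the logical vector is involved.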


As for other similar functions, we can list those in the base package:

x <- as.character(lsf.str("package:base"))
x[grep('<-', x)]
#>  [1] "$<-"                     "$<-.data.frame"         
#>  [3] "@<-"                     "[[<-"                   
#>  [5] "[[<-.data.frame"         "[[<-.factor"            
#>  [7] "[[<-.numeric_version"    "[<-"                    
#>  [9] "[<-.data.frame"          "[<-.Date"               
#> [11] "[<-.factor"              "[<-.numeric_version"    
#> [13] "[<-.POSIXct"             "[<-.POSIXlt"            
#> [15] "<-"                      "<<-"                    
#> [17] "attr<-"                  "attributes<-"           
#> [19] "body<-"                  "class<-"                
#> [21] "colnames<-"              "comment<-"              
#> [23] "diag<-"                  "dim<-"                  
#> [25] "dimnames<-"              "dimnames<-.data.frame"  
#> [27] "Encoding<-"              "environment<-"          
#> [29] "formals<-"               "is.na<-"                
#> [31] "is.na<-.default"         "is.na<-.factor"         
#> [33] "is.na<-.numeric_version" "length<-"               
#> [35] "length<-.factor"         "levels<-"               
#> [37] "levels<-.factor"         "mode<-"                 
#> [39] "mostattributes<-"        "names<-"                
#> [41] "names<-.POSIXlt"         "oldClass<-"             
#> [43] "parent.env<-"            "regmatches<-"           
#> [45] "row.names<-"             "row.names<-.data.frame" 
#> [47] "row.names<-.default"     "rownames<-"             
#> [49] "split<-"                 "split<-.data.frame"     
#> [51] "split<-.default"         "storage.mode<-"         
#> [53] "substr<-"                "substring<-"            
#> [55] "units<-"                 "units<-.difftime"

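They all work the same way, in the sense that fun(x) <- val is equivalent to x <- 'fun<-'(x, val). Beyond that, they behave like any normal function. (The list also picks up the plain assignment operators <- and <<-, which only match the search pattern.)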
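
To see that nothing about these functions is special beyond the name, here is a minimal sketch that defines a new one. The name second<- is invented for this illustration. The sketch assumes only the usual conventions: the function name ends in <-, its last argument is called value, and it returns the modified object.

```r
# Hypothetical replacement function: overwrite the second element of x
`second<-` <- function(x, value) {
  x[2] <- value
  x  # a replacement function must return the modified object
}

v <- 1:5
second(v) <- 10L            # replacement syntax

w <- `second<-`(1:5, 10L)   # equivalent explicit call

identical(v, w)             # both routes give the same vector
```

Note that no function called second needs to exist; R only looks up second<- when it sees second(v) <- 10L.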


R manuals: 3.4.4 Subset assignment

answered Nov 12 '22 00:11 by GGamba