
Is there an R idiom for obtaining the index of the minimum element in a vector after filtering by a boolean index vector?

Suppose I have a vector such as

x <- c(7,2,8,1,6,5)

and a boolean vector such as

b <- c(TRUE,FALSE,FALSE,FALSE,TRUE,FALSE)

I want to find the index of the smallest element in x for which the corresponding element in b is TRUE. However, if I write

which.min(x[b])

it returns 2, because x[b] evaluates to c(7,6). Instead, I want to obtain 5, the corresponding index into the vector x prior to indexing by b. I can write

(1:6)[b][which.min(x[b])]

but that is not very readable! Is there a more readable way?

Tom Dietterich asked May 18 '21 04:05



4 Answers

After you do x[b], the resulting vector has no memory of the original indexes of the values; that information is lost. One alternative is to replace the values where b is FALSE with something very large, for example:

which.min(ifelse(b, x, Inf))
# [1] 5

Another alternative is

which(b)[which.min(x[b])]

This works because which(b) is basically the same as (1:6)[b].
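
As a minimal sketch, the second form can be wrapped in a small helper so that the intent is readable at the call site. The name which_min_where is hypothetical and not part of base R, and the input check is an added assumption about how the helper should behave.

```r
# Hypothetical helper (not part of base R): index into x of the smallest
# element among the positions where b is TRUE.
which_min_where <- function(x, b) {
  stopifnot(is.logical(b), length(x) == length(b))
  idx <- which(b)          # original positions that pass the filter
  idx[which.min(x[idx])]   # map the position within the subset back to x
}

x <- c(7, 2, 8, 1, 6, 5)
b <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
res <- which_min_where(x, b)
print(res)  # 5
```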

MrFlick answered Oct 21 '22 05:10


If the values in x are unique:

which(x == min(x[b]))
#[1] 5

If x could contain duplicates:

which(x == min(x[b]) & b)
#[1] 5
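
A small illustration of why the & b term matters, using made-up vectors x1 and x2 (assumptions for this sketch, not from the question). Note that the == form returns every tied position, whereas which.min returns only the first.

```r
b <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)

# The filtered minimum (6) also occurs at position 2, where b is FALSE.
x1 <- c(7, 6, 8, 1, 6, 5)
without_b <- which(x1 == min(x1[b]))      # 2 5: position 2 is excluded by b
with_b    <- which(x1 == min(x1[b]) & b)  # 5

# The filtered minimum (6) occurs twice among the TRUE positions.
x2 <- c(6, 2, 8, 1, 6, 5)
ties <- which(x2 == min(x2[b]) & b)       # 1 5: every tie is returned
```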
Ronak Shah answered Oct 21 '22 04:10


I suggest replace():

which.min(replace(x, !b, NA))

which is similar to one of GKi's great solutions, but still works if all b are FALSE.
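
A short sketch of the all-FALSE case, with a filter vector named none introduced here only for illustration. Masking with NA yields an empty result, while masking with Inf still reports position 1.

```r
x <- c(7, 2, 8, 1, 6, 5)
none <- rep(FALSE, length(x))   # no element passes the filter

na_result  <- which.min(replace(x, !none, NA))  # integer(0): nothing to pick
inf_result <- which.min(ifelse(none, x, Inf))   # 1: looks valid but is not
```

Checking length(na_result) == 0 is one way for a caller to detect that nothing matched.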

jay.sf answered Oct 21 '22 05:10


Instead of subsetting x with b, the FALSE positions could be set to NA using [<-.

which.min("[<-"(x, !b, NA))
#[1] 5

Alternatively, they could be set to e.g. Inf, as in the answer from @mrflick, but this limits the general applicability: for which.max they need to be set to -Inf instead, and if b contains no TRUE it will still return an index.
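
To illustrate the point about general applicability, here is a sketch in which the same NA mask serves both which.min and which.max. The helper name masked_which and its f argument are hypothetical, introduced only for this example.

```r
# Hypothetical helper: mask the excluded positions with NA, then apply
# which.min or which.max (both discard NA values).
masked_which <- function(x, b, f = which.min) f(replace(x, !b, NA))

x <- c(7, 2, 8, 1, 6, 5)
b <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
lo <- masked_which(x, b)             # 5 (x[5] is 6)
hi <- masked_which(x, b, which.max)  # 1 (x[1] is 7)
```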

The logical vector could be converted to indices using which, and those indices could then be used both to subset x and to be subsetted themselves, similar to the solution of @mrflick, but avoiding using which twice.

i <- which(b)
i[which.min(x[i])]
#[1] 5

If the values in x are all positive, you can divide by b, which gives Inf where b is FALSE (and -Inf for negative x). This approach is not recommended.

which.min(x / b)
#[1] 5
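
A brief sketch of how the division trick fails once x contains a negative value at a filtered-out position. The vector x_neg is made up for this illustration.

```r
b <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
x_neg <- c(7, -2, 8, 1, 6, 5)   # x_neg[2] is negative and b[2] is FALSE

ratio <- x_neg / b               # 7 -Inf Inf Inf 6 Inf
wrong <- which.min(ratio)        # 2, although b[2] is FALSE
right <- which.min(replace(x_neg, !b, NA))  # 5
```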

Comparing with bench::mark:

n <- 1e6
set.seed(42)
x <- sample(0:99, n, TRUE)
b <- sample(c(TRUE,FALSE), n, TRUE)

bench::mark(which.min(ifelse(b, x, Inf))
, which(b)[which.min(x[b])]
#, which(x == min(x[b]))             #Result not equal to others
#, which(x == min(x[b]) & b)         #Result not equal to others
, which.min("[<-"(x, !b, NA))
, which.min("[<-"(x, !b, Inf))
, which.min(x / b)
)
#  expression                        min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
#  <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
#1 which.min(ifelse(b, x, Inf))  13.75ms  13.95ms      70.6   30.52MB     61.2    15    13      212ms
#2 which(b)[which.min(x[b])]      5.13ms    5.2ms     192.    19.07MB     92.5    58    28      303ms
#3 which.min(`[<-`(x, !b, NA))    3.58ms   3.67ms     271.    11.44MB     51.2   106    20      391ms
#4 which.min(`[<-`(x, !b, Inf))   4.85ms   4.96ms     200.    19.07MB    100.     50    25      250ms
#5 which.min(x/b)                 3.99ms   4.05ms     246.     7.63MB     22.6   109    10      442ms

b <- logical(n) #No True
bench::mark(#which.min(ifelse(b, x, Inf)) #Wrong result
  which(b)[which.min(x[b])]
, which.min("[<-"(x, !b, NA))
#, which.min("[<-"(x, !b, Inf))           #Wrong result
#, which.min(x / b)                       #Wrong result
)
#  expression                     min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result
#  <bch:expr>                  <bch:> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>
#1 which(b)[which.min(x[b])]   1.18ms 1.21ms      826.    7.63MB     72.9   340    30      412ms <int …
#2 which.min(`[<-`(x, !b, NA)) 5.36ms 5.49ms      181.   15.26MB     38.3    71    15      392ms <int …

b <- !logical(n) #All True
bench::mark(which(b)[which.min(x[b])]
, which.min("[<-"(x, !b, NA))
)
#  expression                     min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result
#  <bch:expr>                  <bch:> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>
#1 which(b)[which.min(x[b])]   5.06ms 5.25ms      184.    19.1MB     92.0    54    27      293ms <int …
#2 which.min(`[<-`(x, !b, NA)) 3.59ms 3.81ms      261.    11.4MB     48.5   102    19      391ms <int …
GKi answered Oct 21 '22 04:10