
How can I vectorize this function to return an index vector?

Tags: r

I'm new to R and am trying to get a handle on the apply family of functions. Specifically, I am trying to write a function that accepts two character vectors, "host" and "guest" (which need not be the same length), and returns an index vector the same length as "host", where each element is the index of the corresponding "host" element in "guest" (NA if it is not there).

host <- c("A","B","C","D")
guest <- c("D","C","A","F")

matchIndices <- function(x,y)
{
  return(match(x,y))
}

This code returns 3 as expected:

matchIndices(host[1],guest)

This is the loop I'd like to be able to replace with a succinct apply function (sapply?)

for (i in seq_along(host)) {
  idx <- matchIndices(host[i], guest)
  cat(paste(idx, host[i], "\n", sep = ";"))
}

This code "works" in that it produces the output below, but I really want the result to be a vector, and I have a hunch that one of the apply functions will do the trick. I'm just stuck on how to write it. Any help would be most appreciated. Thanks.

3;A;
NA;B;
2;C;
1;D;

asked Jan 19 '11 15:01 by user297400




2 Answers

host <- c("A","B","C","D")
guest <- c("D","C","A","F")

matchIndices <- function(x,y) {
    return(match(x,y))
}

One (inefficient) way is to use sapply over the host vector, passing guest as an extra argument. (You could simplify this to sapply(host, match, guest), but the version below illustrates a general way of approaching this sort of problem.)

> sapply(host, matchIndices, guest)
 A  B  C  D 
 3 NA  2  1
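
If you do stay with the apply family, vapply may be a slightly safer choice than sapply, because it checks that every call returns a value of the type you declare. A minimal sketch, assuming the same host and guest vectors (the name idx is only for illustration):

```r
host <- c("A", "B", "C", "D")
guest <- c("D", "C", "A", "F")

# FUN.VALUE = integer(1) declares that each call must return one integer;
# table = guest is passed through to match() as its second argument.
idx <- vapply(host, match, FUN.VALUE = integer(1), table = guest)
print(idx)
```

This should give the same named result as the sapply call, but it raises an error instead of silently changing shape if a call ever returns something unexpected.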

However, this can be done directly using match, as it accepts a vector as its first argument:

> match(host, guest)
[1]  3 NA  2  1
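
As a rough check of the above, the sketch below compares the vectorized call with the sapply version and shows two related options that may be handy: the nomatch argument of match, and %in% when you only need to know whether each element is present. The variable names (idx, loop_idx, idx0, present) are made up for this example.

```r
host <- c("A", "B", "C", "D")
guest <- c("D", "C", "A", "F")

idx <- match(host, guest)                       # 3 NA 2 1
loop_idx <- unname(sapply(host, match, guest))  # same values, computed one element at a time
same <- identical(idx, loop_idx)                # TRUE

# Use 0 instead of NA for elements of host that are not in guest
idx0 <- match(host, guest, nomatch = 0L)        # 3 0 2 1

# Logical membership test, for when the positions themselves are not needed
present <- host %in% guest                      # TRUE FALSE TRUE TRUE

# The index vector can be used to pull the matching elements back out of guest
guest[idx]                                      # "A" NA "C" "D"
```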

If you want a named vector as output,

> matched <- match(host, guest)
> names(matched) <- host
> matched
 A  B  C  D 
 3 NA  2  1

which could be wrapped into a function

matchIndices2 <- function(x, y) {
    matched <- match(x, y)
    names(matched) <- x
    return(matched)
}

returning

> matchIndices2(host, guest)
 A  B  C  D 
 3 NA  2  1
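
A more compact variant of the same wrapper, sketched here with stats::setNames; the name matchIndices3 is hypothetical and only used to keep it apart from the functions above:

```r
host <- c("A", "B", "C", "D")
guest <- c("D", "C", "A", "F")

# setNames() attaches the names and returns the vector in one step
matchIndices3 <- function(x, y) {
  stats::setNames(match(x, y), x)
}

res <- matchIndices3(host, guest)
print(res)
```

Whether this reads better than the explicit names<- assignment is mostly a matter of taste.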

If you really want the names and the matches stuck together into a vector of strings, then:

> paste(match(host, guest), host, sep = ";")
[1] "3;A"  "NA;B" "2;C"  "1;D"
answered Sep 30 '22 17:09 by Gavin Simpson


If you want the output vector in the host;guestNum format, you can use do.call with paste and match as follows:

> do.call(paste, list(host, sapply(host, match, guest), sep = ';'))
[1] "A;3"  "B;NA" "C;2"  "D;1" 
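
Since both paste and match are already vectorized, the same strings can probably be built without do.call or sapply. A minimal sketch, assuming the host and guest vectors from the question (formatted is just an illustrative name):

```r
host <- c("A", "B", "C", "D")
guest <- c("D", "C", "A", "F")

# paste() works element-wise, so each host value is joined to its index in guest
formatted <- paste(host, match(host, guest), sep = ";")
print(formatted)
```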
answered Sep 30 '22 17:09 by Prasad Chalasani