Given an R list, I wish to find the index of a given list entry. For example, for the entry "36", I want my output to be "2". Also, how could I run such queries in parallel using lapply?
> list
$`1`
[1] "7" "12" "26" "29"
$`2`
[1] "11" "36"
$`3`
[1] "20" "49"
$`4`
[1] "39" "41"
The match() function returns, for each element of its first argument, the position of its first match in the second argument (or NA if there is no match). Example 1: given the vector of values (0,1,2,3,4,5,6,7,8,9), match(5, 0:9) gives the position of the element 5 in that vector.
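As a minimal sketch of the match() behaviour described above (the variable names `x` and `pos` are only illustrative):

```r
# Vector of the values 0 through 9. R indexes from 1, so the value 5 sits at position 6.
x <- 0:9
pos <- match(5, x)
pos
# [1] 6

# match() is vectorized over its first argument and returns NA for values not found.
match(c(9, 42), x)
# [1] 10 NA
```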
For example, if we have a data frame called df that contains a value, say "Data", then we can find the row and column index of that value with the command which(df == "Data", arr.ind = TRUE).
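A small sketch of that command on a made-up data frame (the column names and contents here are assumptions chosen only for illustration):

```r
# Hypothetical data frame in which exactly one cell holds the string "Data".
df <- data.frame(a = c("x", "Data", "y"),
                 b = c("p", "q", "r"),
                 stringsAsFactors = FALSE)

# df == "Data" is a logical matrix; arr.ind = TRUE makes which() report
# row and column positions instead of a single linear index.
hits <- which(df == "Data", arr.ind = TRUE)

hits[1, "row"]  # row position of the match (2 here)
hits[1, "col"]  # column position of the match (1 here)
```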
To access items of a vector in R programming, use index notation with square brackets as vector[index]. For index, we may provide a single value that specifies the position of the item, or we may provide a vector of positions.
Accessing list elements: elements of a list can be accessed by their index in the list. In a named list, they can also be accessed by name.
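To illustrate both kinds of access, here is a short sketch on a named list shaped like the one in the question (the name `lst` is an assumption, used to avoid masking the built-in `list` function):

```r
# Named list mirroring the first two entries of the list in the question.
lst <- list(`1` = c("7", "12", "26", "29"),
            `2` = c("11", "36"))

lst[[2]]    # by position: the character vector "11" "36"
lst[["2"]]  # by name: the same element
lst$`2`     # $ also works; non-syntactic names such as 2 need backticks
lst[2]      # single brackets return a sub-list, not the element itself
```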
Here's a one-liner that allows for the (likely?) possibility that more than one element of the list will contain the string for which you're searching:
## Some example data
ll <- list(1:4, 5:6, 7:12, 1:12)
ll <- lapply(ll, as.character)
which(sapply(ll, FUN=function(X) "12" %in% X))
# [1] 3 4
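Applied to the list from the question, the same idea might look like the sketch below. It swaps sapply() for vapply(), which checks that every call returns a single logical; the names `lst` and `hit` are illustrative only.

```r
# The list from the question, with its names "1" to "4".
lst <- list(`1` = c("7", "12", "26", "29"),
            `2` = c("11", "36"),
            `3` = c("20", "49"),
            `4` = c("39", "41"))

# One TRUE/FALSE per list element: does it contain "36"?
hit <- vapply(lst, function(x) "36" %in% x, logical(1))

unname(which(hit))  # position(s) of the matching element(s)
# [1] 2
names(lst)[hit]     # name(s) of the matching element(s)
# [1] "2"
```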
You could first turn your list into a data.frame that maps values to their corresponding index in the list:
ll <- list(c("7", "12", "26", "29"),
           c("11", "36"),
           c("20", "49"),
           c("39", "41"))
df <- data.frame(value = unlist(ll),
                 index = rep(seq_along(ll), lengths(ll)))
df
#    value index
# 1      7     1
# 2     12     1
# 3     26     1
# 4     29     1
# 5     11     2
# 6     36     2
# 7     20     3
# 8     49     3
# 9     39     4
# 10    41     4
Then, write a function using match for finding the index of the first occurrence of a given value:
find.idx <- function(val) df$index[match(val, df$value)]
You can call this function on a single value, or on many at a time, since match is vectorized:
find.idx("36")
# [1] 2
find.idx(c("36", "41", "99"))
# [1] 2 4 NA
Of course, you can also run it through lapply, especially if you plan to run it in parallel:
lapply(c("36", "41", "99"), find.idx)
# [[1]]
# [1] 2
#
# [[2]]
# [1] 4
#
# [[3]]
# [1] NA
For running this last bit in parallel, there are many, many options. I would recommend you weigh your options by searching through http://cran.r-project.org/web/views/HighPerformanceComputing.html.
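As one possible starting point, here is a sketch using the parallel package that ships with R. It is not the only option and not necessarily the best one for your setup. The helper `find_idx_in` is a hypothetical variant of find.idx that takes the lookup table as an argument, so nothing has to be exported to the workers, and the cluster size of 2 is an arbitrary choice.

```r
library(parallel)  # part of the standard R distribution

# Rebuild the value-to-index lookup table from the example list.
ll <- list(c("7", "12", "26", "29"), c("11", "36"), c("20", "49"), c("39", "41"))
df <- data.frame(value = unlist(ll),
                 index = rep(seq_along(ll), lengths(ll)),
                 stringsAsFactors = FALSE)

# Hypothetical variant of find.idx: the lookup table is passed in explicitly.
find_idx_in <- function(val, lookup) lookup$index[match(val, lookup$value)]

queries <- c("36", "41", "99")

# Start two worker processes, run one query per task, then shut the workers down.
cl <- makeCluster(2)
res <- parLapply(cl, queries, find_idx_in, lookup = df)
stopCluster(cl)

unlist(res)
# [1]  2  4 NA
```

For a lookup table this small, the cost of starting workers will most likely outweigh any gain, and the vectorized call find.idx(c("36", "41", "99")) is simpler. A parallel version is only worth considering when each query is expensive.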