I have an n x 8 matrix, for instance:
set.seed(12345)
m <- matrix(sample(1:50, 800, replace=T), ncol=8)
head(m)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]   37   15   30    3    4   11   35   31
[2,]   44   31   45   30   24   39    1   18
[3,]   39   49    7   36   14   43   26   24
[4,]   45   31   26   33   12   47   37   15
[5,]   23   27   34   29   30   34   17    4
[6,]    9   46   39   34    8   43   42   37
I would like to find a certain pattern in the matrix. For instance, I would like to know where I can find a 37, followed on the next line by a 10 and a 29, and on the line after that by a 42.
This happens, for instance, in lines 57:59 of the above matrix:
m[57:59,]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]  *37   35    1   30   47    9   12   39
[2,]    5   22  *10  *29   13    5   17   36
[3,]   22   43    6    2   27   35  *42   50
A (probably inefficient) solution is to get all the lines containing 37 with
sapply(1:nrow(m), function(x){37 %in% m[x,]})
And then use a few loops to test the other conditions.
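For reference, the loop-based approach described above might look like the following minimal sketch. The function name `find.pattern.loop`, the list-based pattern format (one numeric vector per consecutive line) and the small `toy` matrix are assumptions made for illustration; the toy matrix is built by hand so the result does not depend on the random seed.

```r
# Hypothetical helper: `pattern` is a list with one numeric vector per
# consecutive line; an empty vector (numeric(0)) acts as a "hole".
find.pattern.loop <- function(m, pattern) {
  hits <- integer(0)
  last.start <- nrow(m) - length(pattern) + 1
  for (start in seq_len(max(last.start, 0))) {
    ok <- TRUE
    for (k in seq_along(pattern)) {
      # every value requested for line k must appear somewhere in that line
      if (!all(pattern[[k]] %in% m[start + k - 1, ])) {
        ok <- FALSE
        break
      }
    }
    if (ok) hits <- c(hits, start)
  }
  hits
}

# Hand-built matrix: 37 in line 2, 10 and 29 in line 3, 42 in line 4
toy <- matrix(0, nrow = 5, ncol = 4)
toy[2, 1] <- 37
toy[3, 2:3] <- c(10, 29)
toy[4, 4] <- 42

res <- find.pattern.loop(toy, list(37, c(10, 29), 42))
res
```

This checks every possible starting line against every line of the pattern, which is why it scales poorly on large matrices.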
How could I write an efficient function to do this that can be generalized to any user-given pattern (not necessarily over 3 lines, with possible "holes", with a variable number of values in each line, etc.)?
EDIT: to answer various comments, a pattern could be written as a string such as
37;10,29;42
where ; represents a new line and , separates values on the same line.
Similarly we may look for
50,51;;75;80,81
Meaning 50 and 51 in line n, 75 in line n+2, and 80 and 81 in line n+3
This reads easily and is hopefully generalizable enough for you:
has.37 <- rowSums(m == 37) > 0
has.10 <- rowSums(m == 10) > 0
has.29 <- rowSums(m == 29) > 0
has.42 <- rowSums(m == 42) > 0
lag <- function(x, lag) c(tail(x, -lag), c(rep(FALSE, lag)))
which(has.37 & lag(has.10, 1) & lag(has.29, 1) & lag(has.42, 2))
# [1] 57
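To see why the shifting works, here is a small worked sketch on a hand-built matrix. The `toy` matrix is an assumption made for illustration (it is not the matrix from the question); the `lag` helper is the same one as above. Shifting a row indicator up by k lines means that position i answers "does line i + k contain the value?", so all conditions can be combined with `&` at the line where the pattern starts.

```r
# Same helper as above: shift a logical vector up by `lag` lines, padding with FALSE
lag <- function(x, lag) c(tail(x, -lag), rep(FALSE, lag))

# Hand-built matrix: 37 in line 2, 10 and 29 in line 3, 42 in line 4
toy <- matrix(0, nrow = 5, ncol = 4)
toy[2, 1] <- 37
toy[3, 2:3] <- c(10, 29)
toy[4, 4] <- 42

has.37 <- rowSums(toy == 37) > 0   # TRUE only at line 2
has.10 <- rowSums(toy == 10) > 0   # TRUE only at line 3
has.29 <- rowSums(toy == 29) > 0   # TRUE only at line 3
has.42 <- rowSums(toy == 42) > 0   # TRUE only at line 4

shifted.10 <- lag(has.10, 1)       # TRUE moves from line 3 to line 2
shifted.42 <- lag(has.42, 2)       # TRUE moves from line 4 to line 2

start <- which(has.37 & shifted.10 & lag(has.29, 1) & shifted.42)
start
```

After shifting, every indicator is TRUE at line 2, which is the line where the pattern begins in this toy matrix.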
Edit: here is a generalization that can use positive and negative lags:
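find.combo <- function(m, pattern.df) {
  # Shift a logical vector by i lines (up if i > 0, down if i < 0), padding with FALSE
  lag <- function(v, i) {
    if (i == 0) v else
    if (i > 0) c(tail(v, -i), rep(FALSE, i)) else
    c(rep(FALSE, -i), head(v, i))
  }
  # Lines containing x, shifted so that a hit lines up with the line at lag 0
  find.one <- function(x, i) lag(rowSums(m == x) > 0, i)
  matches <- mapply(find.one, pattern.df$value, pattern.df$lag)
  # Keep the lines where every element of the pattern matched
  which(rowSums(matches) == ncol(matches))
}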
Tested here:
pattern.df <- data.frame(value = c(40, 37, 10, 29, 42),
                         lag   = c(-1,  0,  1,  1,  2))
find.combo(m, pattern.df)
# [1] 57
Edit2: following the OP's edit regarding a GUI input, here is a function that transforms the GUI input into the pattern.df my find.combo function expects:
convert.gui.input <- function(string) {
  rows   <- strsplit(string, ";")[[1]]
  values <- strsplit(rows, ",")
  data.frame(value = as.numeric(unlist(values)),
             lag   = rep(seq_along(values), sapply(values, length)) - 1)
}
Tested here:
find.combo(m, convert.gui.input("37;10,29;42"))
# [1] 57
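The same two functions also appear to handle patterns with "holes", such as the `50,51;;75;80,81` example from the question, because an empty segment between two `;` contributes no values but still advances the lag. The sketch below repeats both functions so it runs on its own; the `toy` matrix is an assumption built by hand for illustration.

```r
find.combo <- function(m, pattern.df) {
  lag <- function(v, i) {
    if (i == 0) v else
    if (i > 0) c(tail(v, -i), rep(FALSE, i)) else
    c(rep(FALSE, -i), head(v, i))
  }
  find.one <- function(x, i) lag(rowSums(m == x) > 0, i)
  matches <- mapply(find.one, pattern.df$value, pattern.df$lag)
  which(rowSums(matches) == ncol(matches))
}

convert.gui.input <- function(string) {
  rows   <- strsplit(string, ";")[[1]]
  values <- strsplit(rows, ",")
  data.frame(value = as.numeric(unlist(values)),
             lag   = rep(seq_along(values), sapply(values, length)) - 1)
}

# Hand-built matrix: 50 and 51 in line 2, nothing required in line 3,
# 75 in line 4, 80 and 81 in line 5
toy <- matrix(0, nrow = 6, ncol = 3)
toy[2, 1:2] <- c(50, 51)
toy[4, 3]   <- 75
toy[5, 2:3] <- c(80, 81)

parsed <- convert.gui.input("50,51;;75;80,81")
parsed   # values 50, 51, 75, 80, 81 with lags 0, 0, 2, 3, 3

res <- find.combo(toy, parsed)
res
```

The empty segment yields a zero-length vector after the second `strsplit`, so no row is emitted for lag 1 and the following values get lags 2 and 3.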