
Number sequence recognition

Tags:

r

This follows on from another question:

Identifying sequences of repeated numbers in R

I used the answers from that question to identify runs of a repeated number in my data without problems. However, I am stuck on identifying sequences made up of differing numbers, for example 126,126,25.

The code I am currently using is the same as in the above question (rle).

sample data:

   d<-read.table(text='Date.Time Aerial
794  "2012-10-01 08:18:00"      1
795  "2012-10-01 08:34:00"      1
796  "2012-10-01 08:39:00"      1
797  "2012-10-01 08:42:00"      1
798  "2012-10-01 08:48:00"      1
799  "2012-10-01 08:54:00"      1
800  "2012-10-01 08:58:00"      1
801  "2012-10-01 09:04:00"      1
802  "2012-10-01 09:05:00"      1
803  "2012-10-01 09:11:00"      1
1576 "2012-10-01 09:17:00"      2
1577 "2012-10-01 09:18:00"      2
804  "2012-10-01 09:19:00"      1
805  "2012-10-01 09:20:00"      1
1580 "2012-10-01 09:21:00"      2
1581 "2012-10-01 09:23:00"      2
806  "2012-10-01 09:25:00"      1
807  "2012-10-01 09:32:00"      1
808  "2012-10-01 09:37:00"      1
809  "2012-10-01 09:43:00"      1', header=TRUE, stringsAsFactors=FALSE, row.names=1)

Code that recognises a run of a repeated number (the same number repeated at least 4 times):

tmp <- rle(d$Aerial)
d$newCol <- rep(tmp$lengths>=4, times = tmp$lengths)

However, I do not know how to identify a sequence that contains different numbers, for example 1,2,2,1 (as in d$Aerial, starting at "2012-10-01 09:11:00").

There are various patterns. The data are detections of a signal at a given time on a given aerial, but I have simplified them as above to keep the question open. The pattern 1,2,2,1 means a detection at Aerial 1, then 2, then 2, then 1 (in the Aerial column). In my data this pattern indicates a behavioural movement of an animal. If I can identify it, I can then perform more calculations on it.

The code above flags a number that is repeated 4 or more times, but it cannot identify a sequence of 4 numbers that are not all the same, such as 1,2,2,1.

This sequence (1,2,2,1) may come up multiple times in the data and I would like to identify it each time.

Salmo salar asked Mar 10 '13 23:03


2 Answers

Brute-force solution:

pat <- c(1,2,2,1)
# Test every window of length(pat) consecutive rows, including the last one
x <- sapply(seq_len(nrow(d) - length(pat) + 1), function(i) all(d$Aerial[i:(i + length(pat) - 1)] == pat))

d[which(x),]  # "which" prevents recycling of the shorter vector "x"
##               Date.Time Aerial
## 803 2012-10-01 09:11:00      1
## 805 2012-10-01 09:20:00      1
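
The same window test can also be written without an explicit loop. The following is only a sketch in base R, assuming the Aerial column contains no missing values; it uses embed() to build all windows at once, and the vector aerial is simply the Aerial column from the sample data typed out so that the snippet runs on its own:

```r
# Aerial column from the sample data in the question
aerial <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 1)
pat <- c(1, 2, 2, 1)
k <- length(pat)

# embed() returns one row per window, but with the values in reverse order,
# so the columns are flipped to restore the original order
w <- embed(aerial, k)[, k:1, drop = FALSE]

# t(w) has one window per column; pat is recycled down each column,
# so a column sum of k means every element of that window matched
starts <- which(colSums(t(w) == pat) == k)
starts
```

Here starts holds the row positions where the pattern begins, so d[starts, ] would give the same rows as above.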

The zoo package has rollapply, which can be used for this:

library(zoo)
x <- rollapply(d$Aerial, length(pat), FUN=function(x) all(x == pat))

d[which(x),]
##               Date.Time Aerial
## 803 2012-10-01 09:11:00      1
## 805 2012-10-01 09:20:00      1

In response to a (now deleted) comment: to find the rows that match the final element of the pattern, shift the index by the pattern length:

d[which(x)+length(pat)-1,]
##               Date.Time Aerial
## 804 2012-10-01 09:19:00      1
## 806 2012-10-01 09:25:00      1
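
If the goal is a logical column like newCol in the question, marking every row that belongs to a match rather than only the first or last one, something along these lines might work. This is a sketch, not a tested solution for the real data: find_pattern and in_pattern are names made up for illustration, and aerial is the Aerial column from the sample data typed out so the snippet is self-contained.

```r
# Aerial column from the sample data in the question
aerial <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 1)
pat <- c(1, 2, 2, 1)

# Hypothetical helper: start index of every window of v that equals pat
find_pattern <- function(v, pat) {
  n <- length(v) - length(pat) + 1
  if (n < 1) return(integer(0))
  starts <- seq_len(n)
  # isTRUE() treats a window containing NA as a non-match
  hits <- vapply(starts,
                 function(i) isTRUE(all(v[i:(i + length(pat) - 1)] == pat)),
                 logical(1))
  starts[hits]
}

starts <- find_pattern(aerial, pat)

# Mark every row covered by a match, analogous to newCol in the question
in_pattern <- rep(FALSE, length(aerial))
for (s in starts) {
  in_pattern[s:(s + length(pat) - 1)] <- TRUE
}
which(in_pattern)
```

With the data frame from the question, d$newCol <- in_pattern would attach the flags, since aerial has one entry per row of d.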
Matthew Lundberg answered Nov 07 '22 16:11


If you don't know what the patterns are going to be in advance (which is what I initially took from your question), then here's a brute force solution that will find repeated patterns of a given length:

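pattern_length <- 4
patterns <- list()
# Collect every window of pattern_length consecutive values, including the last one
for (i in seq_len(nrow(d) - pattern_length + 1)) {
  patterns[[i]] <- d$Aerial[i:(i + pattern_length - 1)]
}
# Keep each pattern that occurs more than once, listed once
unique(patterns[duplicated(patterns)])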

[[1]]
[1] 1 1 1 1

[[2]]
[1] 1 1 2 2

[[3]]
[1] 1 2 2 1

[[4]]
[1] 2 2 1 1

You could then feed these into Matthew Lundberg's answer.
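
As a rough sketch of that last step, the repeated patterns can be matched back against the windows to get the start positions of each one. The names windows, repeated and starts are chosen here for illustration, and aerial is the Aerial column from the sample data typed out so the snippet runs on its own:

```r
# Aerial column from the sample data in the question
aerial <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 1)
pattern_length <- 4

# All windows of pattern_length consecutive values
n_windows <- length(aerial) - pattern_length + 1
windows <- lapply(seq_len(n_windows),
                  function(i) aerial[i:(i + pattern_length - 1)])

# Patterns that occur more than once, each listed once
repeated <- unique(windows[duplicated(windows)])

# For each repeated pattern, the positions where it starts
starts <- lapply(repeated, function(p) {
  which(vapply(windows, function(w) all(w == p), logical(1)))
})
names(starts) <- vapply(repeated, paste, character(1), collapse = ",")
starts
```

Each element of starts is named after its pattern, so starts[["1,2,2,1"]] gives the rows where that pattern begins.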

Marius answered Nov 07 '22 14:11