Algorithm to count occurrences of a matrix inside a larger one

Question

i'm facing a problem right now, i need to count the number of time a certain MxM matrix appears inside a NxN one (this one should be larger than the first one). Any hints on how to do this? I'm going to implement it in C and there is no option to change it.

Revision 1

Hi everyone, i would really like to say thank to all the answers and opinions on the matter. I should tell you that after many hours of hard work we have come to a solutions that is not strictly like the Boyer-Moore approach, but rather an algorithm on my own. I'm planning on publishing it once is tested and finished. The solutions is now being adapted to be paralelized for speed optimization using the university cluster with the C Library MPI.

Nemo · Accepted Answer

Hm, sounds like a 2-dimensional version of string matching. I wonder if there's a 2D version of Boyer-Moore?

A Boyer-Moore Approach for Two-Dimensional Matching

Ah, apparently there is. :-)

R.. GitHub STOP HELPING ICE · Answer

One approach that immediately came to mind:

Treat the big matrix as a linear string of N*N "characters" and the small matrix as a linear regular expression of length (M+1)*M with "literal characters" in the first M positions of each "row" and the equivalent of .{N-M} in the remaining position of each row. Compile the regular expression to a DFA and run it.

Time to find all matches is O(N*N). I suspect there are fancier algorithms that would work (adaptations of fancier 1-d substring search algorithms) but this one's not too bad.

j_random_hacker · Answer

Recently, fast algorithms have been developed for building 2D suffix trees. These can be used to find any square submatrix in a larger matrix in time independent of the size of the larger matrix:

http://www.springerlink.com/content/96511486841752h4/
http://www.springerlink.com/content/e65lt4r77773l29g/

You might need to be a subscriber to see those papers.

I also found this freely available one, which describes a randomised construction algorithm that is expected linear-time:

http://www.cs.nyu.edu/cs/faculty/cole/papers/CH00.ps

I expect that constructing a 2D suffix tree and searching with it would be faster if you need to search for many different submatrices in a single matrix, but it's probably slower than 2D Boyer-Moore if you'll be searching inside a different matrix each time.

Algorithm to count occurrences of a matrix inside a larger one

Tags:

c

algorithm

count

matrix

guiman

3 Answers

Nemo

R.. GitHub STOP HELPING ICE

j_random_hacker

Recent Activity

Donate For Us

Algorithm to count occurrences of a matrix inside a larger one

Tags:

c

algorithm

count

matrix

guiman

3 Answers

Nemo

R.. GitHub STOP HELPING ICE

j_random_hacker

Related questions

Recent Activity

Donate For Us