Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Errata in the original paper on suffix arrays?

I'm looking at the pseudo-code given in figure 3 of the original paper introducing suffix arrays "SUFFIX ARRAYS: A NEW METHOD FOR ON-LINE STRING SEARCHES".

I can't figure out the logic for lines 4 and 5 (indexing from 0). The lines reads:

else if r < P or wr ≤ aPos[N-1]+rthen
LW ← N

W is the pattern of length 'P' we're looking for and r is lcp(A[pos[N-1]:], W). The problem is that in almost all cases, this lcp will be less than length of 'W'. This conditional is meant to handle the case (I think) that the pattern is lexicographically larger than the lexicographically greatest suffix in the array, but it doesn't test this at all. Lines 2 & 3, on the other hand, which test if W is less than the lexicographically smallest suffix seem to make perfect sense

if l = P or wl ≤ aPos[0]+lthen
LW ← 0

I believe that the original lines should read something like:

else if r < P and wr > aPos[N-1]+rthen
LW ← N

The only way that W can be greater than A[pos[N-1]:] is if it has an lcp shorter than the length of the pattern (otherwise, all of W matches and so W cannot be greater, only lesser or equal to the thing with which we're sharing the lcp) AND if the character after the lcp is greater in W than in A[pos[N-1]]. Does this seem to make sense? Is this an error in the original paper? If not, can someone please explain to me how I'm misinterpreting the original code?

like image 902
nomad Avatar asked Oct 09 '14 16:10

nomad


People also ask

What is suffix of an array?

In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics. Suffix array.

How do you make a suffix array?

A suffix array can be constructed from Suffix tree by doing a DFS traversal of the suffix tree. In fact Suffix array and suffix tree both can be constructed from each other in linear time. A simple method to construct suffix array is to make an array of all suffixes and then sort the array.


1 Answers

I think you understand the paper correct and in fact it has an error.

Consider the following example: let A = banana, W = nana. Then A[pos[N-1]:] = nana. Algorithm sets LW = N or even fails while in fact it's N-1.

like image 56
Attila Avatar answered Oct 10 '22 16:10

Attila