I'm doing fast prefix lookups with java.util.TreeSet. Could I be using Scala's TreeSet instead, or is there a different solution altogether?
    import scala.collection.mutable.ListBuffer

    /** A class that uses a TreeSet to do fast prefix matching
      */
    class PrefixMatcher {
      private val _set = new java.util.TreeSet[String]
      def add(s: String) = _set.add(s)
      def findMatches(prefix: String): List[String] = {
        val matches = new ListBuffer[String]
        // tailSet holds every element >= prefix, in sorted order
        val tailSet = _set.tailSet(prefix)
        for (tail <- tailSet.toArray) {
          val tailString = tail.asInstanceOf[String]
          if (tailString.startsWith(prefix))
            matches += tailString
          else
            return matches.toList // first non-match ends the run of matches
        }
        matches.toList
      }
    }
Use a trie. Nobody has actually posted a trie here yet, although some people have posted sorted TreeMap data structures that they have misnamed as tries. Here is a fairly representative sample of a trie implementation in Java; I don't know of any in Scala. See also the explanation of tries on Wikipedia.
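To make the idea concrete, here is a minimal trie sketch in Scala. It is not the Java sample referred to above; the names TrieNode and Trie are hypothetical, and it assumes Scala 2.12 or later for mutable.TreeMap. Children are kept in a sorted map only so that matches come back in lexicographic order.

```scala
import scala.collection.mutable

// Hypothetical illustration: one node per character, children kept sorted.
final class TrieNode {
  val children = mutable.TreeMap.empty[Char, TrieNode]
  var isWord = false // true if a stored string ends at this node
}

final class Trie {
  private val root = new TrieNode

  def add(s: String): Unit = {
    var node = root
    for (c <- s) node = node.children.getOrElseUpdate(c, new TrieNode)
    node.isWord = true
  }

  // Follow the prefix one character at a time; None if the path does not exist.
  private def locate(prefix: String): Option[TrieNode] =
    prefix.foldLeft(Option(root))((node, c) => node.flatMap(_.children.get(c)))

  def findMatches(prefix: String): List[String] = {
    val out = mutable.ListBuffer.empty[String]
    // Depth-first walk below the prefix node, collecting complete words.
    def collect(n: TrieNode, acc: String): Unit = {
      if (n.isWord) out += acc
      for ((c, child) <- n.children) collect(child, acc + c)
    }
    locate(prefix).foreach(collect(_, prefix))
    out.toList
  }
}
```

Reaching the prefix node costs one map lookup per prefix character, regardless of how many strings are stored; the remaining cost is proportional to the size of the subtree under that node.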
The from & takeWhile approach:
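    import scala.collection.mutable.TreeSet // a Scala sorted set offering add and from

    class PrefixMatcher {
      private val _set = new TreeSet[String]
      def add(s: String) = _set.add(s)
      // from(prefix) yields the elements >= prefix; matches form its leading run
      def findMatches(prefix: String): Iterable[String] =
        _set from prefix takeWhile (_ startsWith prefix)
    }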
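This works because, in lexicographic order, every string that starts with the prefix sorts at or after the prefix itself, and all such strings are contiguous. A small sketch of the same idea on Scala 2.13 or later, where rangeFrom is the non-deprecated name for from; the object name PrefixDemo and the use of an immutable TreeSet are assumptions made for illustration:

```scala
import scala.collection.immutable.TreeSet

// Hypothetical helper, assuming Scala 2.13+ (rangeFrom replaces the deprecated from).
object PrefixDemo {
  def findMatches(words: TreeSet[String], prefix: String): List[String] =
    words
      .rangeFrom(prefix)               // elements >= prefix, in sorted order
      .iterator                        // iterate lazily instead of building a new set
      .takeWhile(_.startsWith(prefix)) // stop at the first element that is not a match
      .toList
}
```

For example, with the words "ap", "app", "apple", "apply" and "banana", a lookup for "app" starts at "app" and stops as soon as it reaches "banana".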
An alternative is to select the subset from prefix to prefix++ (the smallest string that sorts after every string starting with the prefix). This selects only the range of the tree that actually starts with the given prefix, so no filtering of entries is necessary. The subSet method returns a view of the underlying set.
The increment method still needs some work (it does not handle overflow or empty strings), but the intent should be clear.
    class PrefixMatcher {
      private val _set = new java.util.TreeSet[String]
      def add(s: String) = _set.add(s)
      // subSet returns a java.util.SortedSet view, so that is the declared return type
      def findMatches(prefix: String): java.util.SortedSet[String] = {
        def inc(x: String) = { // ignores overflow
          assert(x.length > 0)
          val last = x.length - 1
          (x take last) + (x(last) + 1).toChar
        }
        _set.subSet(prefix, inc(prefix))
      }
    }
The same works with the range method of Scala's jcl.TreeSet (scala.collection.jcl in Scala 2.7; later versions expose range on the standard sorted sets).
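One possible way to close the two gaps mentioned above (overflow and empty strings) is sketched below. The names PrefixRange and upperBound are hypothetical. The sketch assumes the set uses the natural String ordering, which compares UTF-16 code units, so incrementing a Char gives the next value in that order.

```scala
import java.util.{SortedSet, TreeSet}

// Hypothetical sketch; assumes the TreeSet uses the natural String ordering.
object PrefixRange {
  // Smallest string that sorts after every string starting with `prefix`,
  // or None when no such bound exists (empty prefix, or only Char.MaxValue chars).
  def upperBound(prefix: String): Option[String] = {
    // Trailing Char.MaxValue characters cannot be incremented, so drop them first.
    val trimmed = prefix.reverse.dropWhile(_ == Char.MaxValue).reverse
    if (trimmed.isEmpty) None
    else Some(trimmed.init + (trimmed.last + 1).toChar)
  }

  def findMatches(set: TreeSet[String], prefix: String): SortedSet[String] =
    upperBound(prefix) match {
      case Some(hi) => set.subSet(prefix, hi) // half-open range [prefix, hi)
      case None     => set.tailSet(prefix)    // no upper bound: everything >= prefix
    }
}
```

With no upper bound available, tailSet is still exact: for an empty prefix every element matches, and for a prefix made only of Char.MaxValue characters every element at or after it must start with it.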