I'm porting an algorithm from Java to Scala that does a range search on a VP Tree. Briefly, the nodes in the tree have coordinates in space and a radius: nodes within that radius can be found on the left subtree, whilst nodes outside that radius are found on the right subtree. A range search attempts to find all objects in the tree within a specified distance to a query object.
In Java I passed the function an arraylist in which it accumulated results, possibly recursing down one of either or both subtrees. Here's a straight port into Scala:
def search(node: VPNode[TPoint, TObject], query: TPoint, radius: Double,
results: collection.mutable.Set[TObject]) {
var dist = distance(query, node.point)
if (dist < radius)
results += node.obj
if (node.left != null && dist <= radius + node.radius)
search(node.left, query, radius, results)
if (node.right != null && dist >= radius + node.radius)
search(node.right, query, radius, results)
}
Scala's default collection types are immutable, and I was thinking it was a bit annoying having to type collection.mutable.
all the time, so I started looking into it. It seems to be recommended that using the immutable collections are nearly always fine: I'm using this code to do millions of lookups though, and it seems to me that copying and concatenating the results array would slow it down.
Answers like this for example suggest that the problem needs to be approached more 'functionally'.
So, what should I do to solve this problem in a more Scala-esque fashion? Ideally I'd like it to be as fast as the Java version, but I'm interested in solutions regardless (and can always profile them to see if it makes much difference).
Note, I only just started learning Scala (figured I may as well cut my teeth on something useful) but I'm not new to functional programming, having used Haskell before (although I don't think I'm that good at it!).
Even though the static type of such a collection provides no operations for modifying the collection, it might still be possible that the run-time type is a mutable collection which can be changed by other clients. By default, Scala always picks immutable collections.
Instead of mutating the state, create new state with computed value and return it to the user. In the same way, check for funds and create new state and return it to the user. In functional programming it is very common to create new state instead of trying to mutate the old state.
An object whose state cannot change after it has been constructed is called immutable (unchangable). The methods of an immutable object do not modify the state of the object. In Scala, all number types, strings, and tuples are immutable.
They are supported by class immutable. HashMap. Their representation is similar to vectors in that they are also trees where every node has 32 elements or 32 subtrees. But the selection of these keys is now done based on hash code.
This is what I would consider a more functional approach:
val emptySet = Set[TObject]()
def search(node: VPNode[TPoint, TObject], query: TPoint, radius: Double): Set[TObject] = {
val dist = distance(query, node.point)
val left = Option(node.left) // avoid nulls
.filter(_ => dist <= radius + node.radius) // do nothing if predicate fails
.fold(emptySet)(l => search(l, query, radius)) // continue your search
val right = Option(node.right)
.filter(_ => dist >= radius + node.radius)
.fold(emptySet)(r => search(r, query, radius))
left ++ right ++ (if (dist < radius) Set(node.obj) else emptySet)
}
Instead of passing around your mutable.Set
to each search
function, the search
function returns a Set[TObject]
which it then concatenates onto other sets. If you were to build up the function calls, it would look like each node of your tree was being concatenated with each other (assuming they were in your radius).
From an efficiency perspective, this is probably not as efficient as the mutable version. Using a List
instead of a Set
would probably be better, and then you can convert the final List
to a Set
when you're done (though still probably not as fasts as the mutable version).
UPDATE To answer your question about the benefits:
Option
/filter
/fold
can look a bit strange, but after you start using them for awhile (just like anything) it becomes easy to read. I would compare this to being able to read LINQ in .NET.List
you should get equal if not better performance from your original version. It appears that you don't really need to use Set
which has the overhead of making sure everything in the set is unique.Since you seem fairly interested, I would recommend reading Functional Programming in Scala. It goes over all of these basics in what I think is a great way for beginners.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With