I implemented a binary tree data structure in Haskell. Here is my code:
module Data.BTree where

data Tree a = EmptyTree
            | Node a (Tree a) (Tree a)
            deriving (Eq, Ord, Read, Show)

emptyTree :: a -> Tree a
emptyTree a = Node a EmptyTree EmptyTree
treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = emptyTree x
treeInsert x (Node a left right)
    | x == a = Node x left right
    | x < a  = Node a (treeInsert x left) right
    | x > a  = Node a left (treeInsert x right)
fillTree :: Int -> Tree Int -> Tree Int
fillTree 10000 tree = tree
fillTree x tree = let a = treeInsert x tree
                  in fillTree (x + 1) a
This code is very slow. When I run:

fillTree 1 EmptyTree

it takes 50.24 seconds. I implemented the same test in C, and there the result is 0m0.438s.

Why is the difference so big? Is Haskell really this slow, or is my binary tree implementation bad? I would like to ask a Haskell guru: can I make my binary tree implementation more efficient?

Thank you.
First, another data point: the Set data structure in the Data.Set module happens to be a binary tree. I've translated your fillTree function to use it instead:
import qualified Data.Set as Set
import Data.Set (Set)

fillSet :: Int -> Set Int -> Set Int
fillSet 10000 set = set
fillSet x set = let a = Set.insert x set
                in fillSet (x + 1) a
Running fillSet 1 Set.empty in GHCi, including a bit of extra computation to be sure that the entire result is evaluated, runs with no perceptible delay. So this seems to indicate that the problem lies in your implementation.
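For concreteness, something along these lines is enough for that forcing step; the exact expression doesn't matter, as long as it demands every element of the set (note that the loop stops before inserting 10000, so the keys are 1 through 9999):

-- A sketch of a forcing expression: summing every element demands the
-- entire set, so laziness cannot hide the cost of building it.
sumSet :: Set Int -> Int
sumSet = Set.foldr (+) 0

-- In GHCi:
--   sumSet (fillSet 1 Set.empty)
-- should print 49995000, the sum of 1..9999.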
To start with, I suspect the biggest difference between using Data.Set.Set and your implementation is that, if I'm reading your code correctly, you're not actually testing a binary tree. You're testing an over-complicated linked list, i.e. a maximally unbalanced tree, as a result of inserting elements in increasing order. Data.Set.Set uses a balanced binary tree, which handles the pathological input better in this case.
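You can check the shape of the resulting tree directly with a small helper (a sketch, not part of your original code):

-- Height of the tree: 0 for an empty tree, otherwise one more than the
-- deeper of the two subtrees.
depth :: Tree a -> Int
depth EmptyTree    = 0
depth (Node _ l r) = 1 + max (depth l) (depth r)

-- depth (fillTree 1 EmptyTree) is 9999: every left subtree is empty, so the
-- "tree" is really a 9999-element list. A balanced tree over the same keys
-- would have a depth of about 14.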
We can also look at the definition of Set:
data Set a = Tip
           | Bin {-# UNPACK #-} !Size a !(Set a) !(Set a)
Without going into too much detail, what this says is that Set tracks the size of the tree, and that the strict, unpacked fields avoid a few less-than-useful layers of indirection that would otherwise exist in the data type.
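If you wanted to apply the same ideas to your own type, a sketch might look like this (not a drop-in replacement: treeInsert would also have to maintain the size field, and the real payoff of a stored size is that it lets you rebalance, as Data.Set does):

-- A size-tracking, strict variant of Tree, modelled on Bin above. The bang
-- patterns make the size and both subtrees strict, and UNPACK stores the Int
-- size directly in the constructor rather than behind a pointer.
data Tree' a = Empty
             | Node' {-# UNPACK #-} !Int a !(Tree' a) !(Tree' a)

size :: Tree' a -> Int
size Empty           = 0
size (Node' n _ _ _) = n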
The full source of the Data.Set module can be found here; you may find it enlightening to study.
A few more observations, to demonstrate the difference between different ways of running it. I added the following to your code:
toList :: Tree a -> [a]
toList EmptyTree = []
toList (Node x l r) = toList l ++ [x] ++ toList r

main :: IO ()
main = print . sum . toList $ fillTree 1 EmptyTree
This traverses the tree, sums the elements, and prints the total, which should ensure that everything is forced. My system is probably somewhat unusual so you may get rather different results trying this yourself, but relative differences should be accurate enough. Some results:
Using runhaskell, which should be roughly equivalent to running it in GHCi:
real 1m36.055s
user 0m0.093s
sys 0m0.062s
Building with ghc --make -O0:
real 0m3.904s
user 0m0.030s
sys 0m0.031s
Building with ghc --make -O2:
real 0m1.765s
user 0m0.015s
sys 0m0.030s
Using my equivalent function based on Data.Set instead:
Using runhaskell:
real 0m0.521s
user 0m0.031s
sys 0m0.015s
Using ghc --make -O2:
real 0m0.183s
user 0m0.015s
sys 0m0.031s
And the moral of today's story is: Evaluating expressions in GHCi and timing them with a stopwatch is a very, very bad way to test the performance of your code.
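If you want numbers you can actually trust, a benchmarking library such as criterion is the usual tool. A minimal sketch, assuming the toList helper above is in scope and replacing the main shown earlier:

import Criterion.Main

-- Compile with ghc -O2. Forcing the sum of the elements ensures the whole
-- tree is built for every measured sample, not just its outermost node.
main :: IO ()
main = defaultMain
    [ bench "fillTree 1..9999" $ whnf (\n -> sum (toList (fillTree n EmptyTree))) 1 ]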
I doubt you implemented the same code in C. You probably used a non-persistent tree structure instead, which would mean comparing an O(n^2) algorithm in Haskell to an O(n) algorithm in C. Never mind, though: the specific case you're testing is O(n^2) whether the structure is persistent or not. There's just a lot more allocation with the persistent structure, so it's not a fundamental algorithmic difference.

Additionally, it looks like you ran this from GHCi. The 'i' in "ghci" stands for "interpreter", and yes, the interpreter can be tens or hundreds of times slower than compiled code. Try compiling it with optimizations and running it again. I suspect it will still be slower because of the extra allocation mentioned above, but it won't be anywhere near 50 seconds.