I'm working on a large application that constructs a lot of Data.Text values on the fly. I've been building all my Text values using (<>) and Data.Text.concat.
I only recently learned of the existence of the Builder type. The Beginning Haskell book has this to say about it:
Every time two elements are concatenated, a new Text value has to be created, and this comes with some overhead to allocate memory, to copy data, and also to keep track of the value and release it when it's no longer needed... Both the text and bytestring packages provide a Builder data type that can be used to efficiently generate large text values. [pg 240]
However, the book doesn't give any indication of exactly what is meant by "large text values."
So, I'm wondering whether or not I should refactor my code to use Builder. Maybe you can help me make that decision.
Specifically, I have these questions:
1) Are there any guidelines or "best practices" regarding when one should choose Builder over concatenation? Or, how do I know that a given Text value is "large" enough that it merits using Builder?
2) Is using Builder a "no brainer," or would it be worthwhile doing some profiling to confirm its benefits before undertaking a large-scale refactoring?
Thanks!
Data.Text.concat is an O(n+m) operation, where n and m are the lengths of the strings you want to concatenate. This is because a new memory buffer of size n + m must be allocated to store the result of the concatenation, and both inputs are copied into it.
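To illustrate the copying cost, here is a minimal sketch. The function names are hypothetical and not part of the text package. Assuming strict Text, appending chunks one at a time re-copies the accumulated result at every step, whereas Data.Text.concat is handed the whole list at once:

```haskell
import Data.List (foldl')
import qualified Data.Text as T

-- Hypothetical helper: appends one chunk at a time. Each (<>) allocates a
-- fresh buffer and copies the accumulator into it, so the total amount of
-- copying grows roughly quadratically with the number of chunks.
joinByAppend :: [T.Text] -> T.Text
joinByAppend = foldl' (<>) T.empty

-- Hypothetical helper: passes the whole list to Data.Text.concat, which can
-- size the result buffer once and copy each chunk a single time.
joinByConcat :: [T.Text] -> T.Text
joinByConcat = T.concat
```

Both functions return the same Text; they differ only in how much intermediate copying they do, which matters more as the number of chunks grows.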
Builder is specifically optimized for the mappend operation. It's a cheap O(1) operation (function composition, which is also excellently optimized by GHC). With Builder you are essentially building up the instructions for how to produce the final string result, but delaying the actual creation until you do some Builder -> Text transformation.
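As a rough sketch of what that looks like in practice, the example below builds a Text from key/value pairs using Data.Text.Lazy.Builder. The renderPairs function and its "key=value" output format are made up for illustration; only fromText, singleton, toLazyText and toStrict come from the text package.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B

-- Hypothetical helper: renders each pair as a "key=value" line.
-- The (<>) calls only combine Builders; no Text buffer is filled in
-- until toLazyText runs the accumulated instructions.
renderPairs :: [(T.Text, T.Text)] -> T.Text
renderPairs pairs = TL.toStrict (B.toLazyText (foldMap renderPair pairs))
  where
    renderPair (k, v) =
      B.fromText k <> B.singleton '=' <> B.fromText v <> B.singleton '\n'
```

The final TL.toStrict step is only needed if the rest of the program expects strict Text; if lazy Text is acceptable, the result of toLazyText can be used directly.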
To answer your questions, you should choose Builder if you have profiled your application and discovered that calls to Data.Text.concat are dominating the run time. This will obviously depend on your needs and application. There is no general rule for when you should use Builder, but for short Text literals there is probably no need.
Profiling would definitely be worthwhile if using Builder would involve "undertaking a large-scale refactoring". That said, Haskell tends to make this kind of refactoring much less painful than you might be used to in less developer-friendly languages, so it might not be such a difficult undertaking after all.
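Before committing to a refactoring, a quick measurement can give a first impression. The sketch below is only a crude timing harness under stated assumptions: the workload (many copies of a short chunk) and all function names are hypothetical, and a single getCPUTime measurement is noisy. A dedicated benchmarking library such as criterion, or GHC's built-in profiler, would give more trustworthy numbers for a real application.

```haskell
import Control.Exception (evaluate)
import Data.List (foldl')
import System.CPUTime (getCPUTime)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B

-- Hypothetical workload: n copies of a short chunk.
chunks :: Int -> [T.Text]
chunks n = replicate n (T.pack "chunk ")

-- Candidate 1: repeated strict appends.
buildByAppend :: Int -> T.Text
buildByAppend n = foldl' (<>) T.empty (chunks n)

-- Candidate 2: accumulate a Builder, then convert once at the end.
buildByBuilder :: Int -> T.Text
buildByBuilder n = TL.toStrict (B.toLazyText (foldMap B.fromText (chunks n)))

-- CPU time, in picoseconds, spent producing the result. evaluate forces
-- the value to weak head normal form, which for strict Text means the
-- underlying buffer has been constructed.
timeBuild :: (Int -> T.Text) -> Int -> IO Integer
timeBuild build n = do
  start <- getCPUTime
  _ <- evaluate (build n)
  end <- getCPUTime
  pure (end - start)
```

In GHCi, comparing timeBuild buildByAppend 20000 with timeBuild buildByBuilder 20000 at a few different sizes should show whether the difference is large enough to matter for inputs like yours.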