
Which techniques have contributed the most to Haskell's improving performance? [closed]

Tags: haskell, ghc

I learned a bit of Haskell at university back in the late 90s. At the time, though performance was adequate and much better than one would expect for such a high-level language, it still was nothing to write home about.

Things have changed. Haskell (GHC) today has great performance, often not far from C/C++. So what exactly changed in the compiler that contributed the most to this improvement? I am aware of several commonly used techniques, such as better unboxing and strictness analysis. What I would like is some rough idea of the quantitative contribution that each of these techniques has made to the overall performance improvement.

If you prefer, the question can also be framed in the following terms: consider the not-so-great performance of GHC Haskell in the mid 90s. What would be the top 5 areas to improve to bring performance closer to that of 2013 GHC Haskell?

Jon Smark asked Apr 01 '13

1 Answer

"rough idea about the quantitative contribution that each of these techniques"

The problem with this question is that it is essentially unanswerable in that level of detail.

For 15 years, all aspects of the software stack -- from user code and idioms to libraries, compiler optimizations, code generation, and the runtime -- have been improved. Performance has been a major focus for more than a dozen developers for several years now.

As a result, thousands of changes have been made, resulting in the performance we get today from GHC. There is no simple list.

Here's a quick list to indicate just how broad and unanswerable this question is.

Compiler Optimizations

Improved compiler optimizations make across-the-board improvements of 1 to 15% each.

  • pointer tagging (14% improvement)
  • constructor specialization (10% improvement)
  • a far more robust inliner
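For instance, constructor specialization (SpecConstr) targets loops like the one below. This is a minimal, hypothetical sketch, not code from the answer: with -O2, GHC can generate a specialized copy of go for the Just case, so the Maybe cell need not be allocated on the hot path.

```haskell
{-# OPTIONS_GHC -O2 #-}
module Main where

-- `go` is always called with a constructor already known at the call
-- site; SpecConstr creates a variant of `go` specialized to `Just`,
-- passing the Int directly instead of boxing it in a Maybe.
sumTo :: Int -> Int
sumTo limit = go (Just 0) 0
  where
    go Nothing  acc = acc
    go (Just n) acc
      | n > limit = go Nothing acc
      | otherwise = go (Just (n + 1)) (acc + n)

main :: IO ()
main = print (sumTo 100)  -- sum of 0..100
```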

Better Libraries

Improved libraries can have huge impacts on particular domains. E.g. for array and string data we now have:

  • bytestring (8x improvement compared to lists)
  • vector
  • repa

These are often 10x faster than the list-based versions.
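A rough illustration of why the packed types win (the sample text here is made up): a String is a linked list of boxed Chars, several machine words per character, while a strict ByteString is a contiguous byte array that also supports fast bulk I/O via B.readFile.

```haskell
-- Counting lines in packed byte data with the bytestring library.
import qualified Data.ByteString.Char8 as B

countLines :: B.ByteString -> Int
countLines = length . B.lines

main :: IO ()
main = print (countLines (B.pack "one\ntwo\nthree"))
```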

Better interfaces

Better interfaces for writing fast code:

  • new primops
  • better numeric conversions
  • FFI bindings with lower overheads
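A minimal sketch of a low-overhead FFI call, binding C's sin from math.h. The unsafe calling convention skips the safe-call bookkeeping, so the cost is close to that of a plain C function call.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Direct import of a C function; `unsafe` means the call cannot
-- call back into Haskell, allowing a cheaper calling sequence.
foreign import ccall unsafe "math.h sin"
  c_sin :: Double -> Double

main :: IO ()
main = print (c_sin 0)
```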

Better tools

Better tools for analyzing performance:

  • reading GHC Core
  • ThreadScope
  • a better profiler
  • better GC tools
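As a small illustration of the profiling tools, cost-centre annotations (SCC pragmas) let you label hot spots by hand; compiling with -prof -fprof-auto and running with +RTS -p then produces a per-function cost report. The fib function here is just a stand-in workload.

```haskell
module Main where

-- The SCC pragma attaches the cost centre "fib" to this expression;
-- without -prof the annotation is simply ignored.
fib :: Int -> Int
fib n = {-# SCC "fib" #-} if n < 2 then n else fib (n - 1) + fib (n - 2)

main :: IO ()
main = print (fib 20)
```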

Runtime

The runtime got smarter -- e.g. the garbage collector is significantly better:

  • the parallel GC, again worth a few percent on every program
  • IO threads got cheaper and faster
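A sketch of how cheap GHC's lightweight threads are: forking a thousand of them is routine. The spawnSum helper is hypothetical, purely for illustration; each forked thread deposits its index in a shared MVar and the parent sums the results as they arrive.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Fork n lightweight threads and collect one value from each.
spawnSum :: Int -> IO Int
spawnSum n = do
  box <- newEmptyMVar
  mapM_ (\i -> forkIO (putMVar box i)) [1 .. n]
  sum <$> mapM (const (takeMVar box)) [1 .. n]

main :: IO ()
main = print =<< spawnSum 1000
```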

Code generation

The code generator is better.

  • instead of generating C, GHC can target LLVM, improving some array programs by 25%, others by 100%
  • the native code generator has also been rewritten and improved
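The kind of tight, strict numeric loop below is where the LLVM backend (ghc -O2 -fllvm) tends to help: GHC unboxes the accumulators, and LLVM's own loop optimizations then apply to the generated code. sumSquares is an illustrative toy, not a benchmark from the answer.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- A strict counting loop over unboxable Int and Double accumulators.
sumSquares :: Int -> Double
sumSquares n = go 0 0
  where
    go !i !acc
      | i >= n    = acc
      | otherwise = go (i + 1) (acc + fromIntegral i * fromIntegral i)

main :: IO ()
main = print (sumSquares 1000)
```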

Better idioms

And finally, the idioms for writing fast code are now far, far more widely understood.
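One of the best-known such idioms: accumulate with a strict left fold. A lazy foldl builds a chain of thunks proportional to the input length, while Data.List.foldl' (plus bang patterns on the accumulator) forces each step and runs in constant space. The mean function is a made-up example.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Strict single-pass mean: the bang patterns keep both the running
-- sum and the running count evaluated at every step.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (!acc, !len) x = (acc + x, len + 1)

main :: IO ()
main = print (mean [1 .. 10])
```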

So, you can name almost any place in the software stack and find that improvements of a few percent have occurred there. There have also been major breakthroughs in runtime, compiler, and library design.

Don Stewart answered Oct 14 '22