F# performance in scientific computing

Question

I am curious as to how F# performance compares to C++ performance? I asked a similar question with regard to Java, and the impression I got was that Java is not suitable for heavy number crunching.

I have read that F# is supposed to be more scalable and more performant, but how does its real-world performance compare to C++? Specific questions about the current implementation are:

How well does it do floating-point?
Does it allow vector instructions?
How friendly is it towards optimizing compilers?
How big a memory footprint does it have? Does it allow fine-grained control over memory locality?
Does it have capacity for distributed memory processors, for example Cray?
What features does it have that may be of interest to computational science where heavy number processing is involved?
Are there actual scientific computing implementations that use it?

Thanks

J D · Accepted Answer

I am curious as to how F# performance compares to C++ performance?

Varies wildly depending upon the application. If you are making extensive use of sophisticated data structures in a multi-threaded program then F# is likely to be a big win. If most of your time is spent in tight numerical loops mutating arrays then C++ might be 2-3× faster.

Case study: Ray tracer

My benchmark here uses a tree for hierarchical culling and numerical ray-sphere intersection code to generate an output image. This benchmark is several years old and the C++ code has been improved upon dozens of times over the years and read by hundreds of thousands of people. Don Syme at Microsoft managed to write an F# implementation that is slightly faster than the fastest C++ code when compiled with MSVC and parallelized using OpenMP.

I have read that F# is supposed to be more scalable and more performant, but how does its real-world performance compare to C++?

Developing code is much easier and faster with F# than C++, and this applies to optimization as well as maintenance. Consequently, when you start optimizing a program the same amount of effort will yield much larger performance gains if you use F# instead of C++. However, F# is a higher-level language and, consequently, places a lower ceiling on performance. So if you have infinite time to spend optimizing you should, in theory, always be able to produce faster code in C++.

This is exactly the same benefit that C++ had over Fortran and Fortran had over hand-written assembler, of course.

Case study: QR decomposition

This is a basic numerical method from linear algebra provided by libraries like LAPACK. The reference LAPACK implementation is 2,077 lines of Fortran. I wrote an F# implementation in under 80 lines of code that achieves the same level of performance. But the reference implementation is not fast: vendor-tuned implementations like Intel's Math Kernel Library (MKL) are often 10× faster. Remarkably, I managed to optimize my F# code well beyond the performance of Intel's implementation running on Intel hardware whilst keeping my code under 150 lines of code and fully generic (it can handle single and double precision, and complex and even symbolic matrices!): for tall thin matrices my F# code is up to 3× faster than the Intel MKL.

Note that the moral of this case study is not that you should expect your F# to be faster than vendor-tuned libraries but, rather, that even experts like Intel's will miss productive high-level optimizations if they use only lower-level languages. I suspect Intel's numerical optimization experts failed to exploit parallelism fully because their tools make it extremely cumbersome whereas F# makes it effortless.

How well does it do floating-point?

Performance is similar to ANSI C but some functionality (e.g. rounding modes) is not available from .NET.

Does it allow vector instructions?

No.

How friendly is it towards optimizing compilers?

This question does not make sense: F# is a proprietary .NET language from Microsoft with a single compiler.

How big a memory footprint does it have?

An empty application uses 1.3 MB here.

Does it allow fine-grained control over memory locality?

Better than most memory-safe languages but not as good as C. For example, you can unbox arbitrary data structures in F# by representing them as "structs".
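As a rough illustration of the point about structs (a sketch, not the answerer's code): the hypothetical Vec3 type below is declared as a struct, so an array of them stores the three floats of each element inline and contiguously rather than holding references to separately allocated heap objects. The struct-record syntax used here assumes F# 4.1 or later, which is newer than the compiler available when this answer was written.

```fsharp
// Vec3 is a hypothetical type used only for illustration.
[<Struct>]
type Vec3 = { X : float; Y : float; Z : float }

// The array stores the Vec3 values inline, not references to heap objects.
let points : Vec3[] =
    Array.init 1000 (fun i -> { X = float i; Y = 0.0; Z = 0.0 })

// A simple pass over the contiguous data.
let sumX = points |> Array.sumBy (fun p -> p.X)

printfn "Vec3 is a value type: %b" (typeof<Vec3>.IsValueType)
printfn "Sum of X = %.1f" sumX
```

Removing the Struct attribute would turn Vec3 into an ordinary reference type, and the array would then hold pointers to individually allocated records.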

Does it have capacity for distributed memory processors, for example Cray?

Depends what you mean by "capacity for". If you can run .NET on that Cray then you could use message passing in F# (just like the next language) but F# is intended primarily for desktop multicore x86 machines.

What features does it have that may be of interest to computational science where heavy number processing is involved?

Memory safety means you do not get segmentation faults and access violations. The support for parallelism in .NET 4 is good. The ability to execute code on-the-fly via the F# interactive session in Visual Studio 2010 is extremely useful for interactive technical computing.

Are there actual scientific computing implementations that use it?

Our commercial products for scientific computing in F# already have hundreds of users.

However, your line of questioning indicates that you think of scientific computing as high-performance computing (e.g. Cray) and not interactive technical computing (e.g. MATLAB, Mathematica). F# is intended for the latter.

Tomas Petricek · Answer

In addition to what others said, there is one important point about F# and that's parallelism. The performance of ordinary F# code is determined by the CLR, although you may be able to use LAPACK from F# or you may be able to make native calls using C++/CLI as part of your project.

However, well-designed functional programs tend to be much easier to parallelize, which means that you can easily gain performance by using multi-core CPUs, which are definitely available to you if you're doing some scientific computing. Here are a couple of relevant links:

F# and Task-Parallel library (blog by Jurgen van Gael, who is doing machine-learning stuff)
Another interesting answer at SO regarding parallelism
An example of using Parallel LINQ from F#
Chapter 14 of my book discusses parallelism (source code is available)
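To make the parallelism point concrete, here is a minimal sketch (not taken from any of the linked material) that uses Array.Parallel.map from the F# core library. The function work and the input size are made up for illustration. Because work has no side effects, switching from the sequential to the parallel version is a one-word change.

```fsharp
// A side-effect-free function: safe to apply to elements in any order.
// `work` is a hypothetical stand-in for a real numerical kernel.
let work (x : float) = sqrt (x * x + 1.0)

let inputs = Array.init 1000000 (fun i -> float i)

// Sequential version.
let seqResult = Array.map work inputs

// Parallel version: same call shape, elements processed on multiple cores.
let parResult = Array.Parallel.map work inputs

printfn "Results agree: %b" (seqResult = parResult)
```

Whether the parallel version is actually faster depends on how expensive the per-element work is; for very cheap functions the scheduling overhead can dominate.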

Regarding distributed computing, you can use any distributed computing framework that's available for the .NET platform. There is an MPI.NET project, which works well with F#, but you may also be able to use DryadLINQ, which is an MSR project.

Some articles: F# MPI tools for .NET, Concurrency with MPI.NET
DryadLINQ project homepage

Joh · Answer

F# does floating point computation as fast as the .NET CLR will allow it. Not much difference from C# or other .NET languages.
F# does not allow vector instructions by itself, but if your CLR has an API for these, F# should not have problems using it. See for instance Mono.
As far as I know, there is only one F# compiler for the moment, so maybe the question should be "how good is the F# compiler when it comes to optimisation?". The answer is in any case "potentially as good as the C# compiler, probably a little bit worse at the moment". Note that F# differs from e.g. C# in its support for inlining at compile time, which potentially allows for more efficient code that relies on generics.
Memory footprints of F# programs are similar to those of other .NET languages. The amount of control you have over allocation and garbage collection is the same as in other .NET languages.
I don't know about the support for distributed memory.
F# has very nice primitives for dealing with flat data structures, e.g. arrays and lists. Look for instance at the content of the Array module: map, map2, mapi, iter, fold, zip... Arrays are popular in scientific computing, I guess due to their inherently good memory locality properties.
For scientific computation packages using F#, you may want to look at what Jon Harrop is doing.
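A short sketch of the Array module functions and the compile-time inlining mentioned above. The sample data and the helper named dot are invented for illustration. The inline keyword lets one definition of dot be specialised at each call site for any element type that supports multiplication, addition and a zero value.

```fsharp
let xs = [| 1.0; 2.0; 3.0 |]
let ys = [| 4.0; 5.0; 6.0 |]

let doubled  = Array.map (fun x -> 2.0 * x) xs            // [|2.0; 4.0; 6.0|]
let sums     = Array.map2 (fun x y -> x + y) xs ys        // [|5.0; 7.0; 9.0|]
let weighted = Array.mapi (fun i x -> float i * x) xs     // [|0.0; 2.0; 6.0|]
let total    = Array.fold (fun acc x -> acc + x) 0.0 xs   // 6.0
let pairs    = Array.zip xs ys                            // [|(1.0, 4.0); (2.0, 5.0); (3.0, 6.0)|]

// `dot` is a hypothetical helper; `inline` makes it generic over numeric types.
let inline dot a b = Array.map2 (fun x y -> x * y) a b |> Array.sum

let dotFloat = dot xs ys                          // 32.0
let dotInt   = dot [| 1; 2; 3 |] [| 4; 5; 6 |]    // 32

printfn "%A %A %A %A %A" doubled sums weighted total pairs
printfn "%A %A" dotFloat dotInt
```

Note that map2 allocates an intermediate array here; a hand-written loop would avoid that at the cost of brevity.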

Tags: c++, performance, parallel-processing, f#, scientific-computing

Anycorn
