What's the easiest way to build an F# compiler that runs on the JVM and generates Java bytecode?

Question

The current F# Compiler is written in F#, is open source and runs on .Net and Mono, allowing it to execute on many platforms including Windows, Mac and Linux. F#'s Code Quotations mechanism has been used to compile F# to JavaScript in projects like WebSharper, Pit and FunScript. There also appears to be some interest in running F# code on the JVM.

I believe a version of the OCaml compiler was used to originally Bootstrap the F# compiler.

If someone wanted to build an F# compiler that runs on the JVM would it be easier to:

Modify the existing F# compiler to emit Java bytecode and then compile the F# compiler with it?
Use a JVM based ML compiler like Yeti to Bootstrap a minimal F# compiler on the JVM?
Re-write the F# compiler from scratch in Java as the fjord project appears to be attempting?
Something else?

mythz · Accepted Answer

Another option that should probably be considered is to convert the .NET CLR byte code into JVM byte-code like http://www.ikvm.net does with JVM > CLR byte codes. Although this approach has been considered and dismissed by the fjord owner.

Getting buy-in from the top with option 1) and have the F# compiler team have pluggable backends that could emit Java bytecode sounds in theory like it would produce the most polished solution.

But if you look at other languages that have been ported to different platforms this is rarely the case. Most of the time it's been a rewrite from scratch. But this is also likely due to the original language team having no interest in supporting alternative platforms themselves and that the original host implementation might've not been able to support multiple backends and it's already too slow for this to be a feasible option to start with.

My hunch is a combination of re-writing from scratch and being able to do as much code sharing and automation as possible from the original implementation. E.g. if the test suites could be re-used for both implementations it would take a lot of the burden off the JVM port and go a long way in ensuring language parity.

t0yv0 · Answer

If I really had to do this, I would probably start with the #1 approach - add JVM backend to the existing compiler. But I would also try to argue for a different target VM.

Quotations are not very relevant - as an author of WebSharper I can assure you that while quotations can give you a nice F#-like language to program with, they are restrictive, and not optimized. I imagine that for potential JVM F# users the bar would be a lot higher - full language compatibility and comparable performance. This is very hard.

Take tail calls, for example. In WebSharper we apply heuristics to optimize some local tail calls to loops in JavaScript, but that is not enough - you cannot in general rely on TCO, as you do in general F# libraries. This is ok for WebSharper as our users do not expect to have full F#, but will not be ok for a JVM F# port. I believe most JVM implementations do not do TCO, so it will have to be implemented with some indirection, introducing a performance hit.

An bytecode re-compilation approach mentioned by @mythz sounds very attractive as it allows more than just porting F# - ideally it allows porting more .NET software to the JVM. I worked quite a bit with .NET bytecode analysis on an internal WebSharper 3.0 project - we are looking at the option of compiling .NET bytecode instead of F# quotations to JavaScript. But there are huge challenges there:

A lot of code in BCL is opaque (native) - and you cannot decompile it
The generics model is fairly complicated. I have implemented a JavaScript runtime that models class and method generics, instantiation, type generation, and basic reflection with some precision and reasonable performance. This was difficult enough in dynamic JavaScript with closures and is seems quite difficult to do in a performant way on the JVM - but maybe I just do not see a simple solution.
Value types create significant complications in the bytecode. I am yet to figure this one out for WebSharper 3.0. They cannot be ignored either, as they are used extensively by many libraries you would want ported.
Similarly, basic reflection is used in many real-world .NET libraries - and it is a nightmare to cross-compile in terms of both lots of native code and proper support for generics and value types.

Also, the bytecode approach does not remove the question on how to implement tail calls. AFAIK, Scala does not implement tailcalls. They have certainly the talent and the funding to do that - the fact that they do not, tells me a lot about how practical it is to do TCO on the JVM. For our .NET->JavaScript port I will probably go a similar route - no TCO guarantees unless you specifically ask for trampolining which will work but cost you an order of magnitude (or two) in performance.

What's the easiest way to build an F# compiler that runs on the JVM and generates Java bytecode?

Tags:

jvm

f#

ocaml

yeti

funscript

Phillip Trelford

2 Answers

mythz

t0yv0

Recent Activity

Donate For Us

What's the easiest way to build an F# compiler that runs on the JVM and generates Java bytecode?

Tags:

jvm

f#

ocaml

yeti

funscript

Phillip Trelford

2 Answers

mythz

t0yv0

Related questions

Recent Activity

Donate For Us