I'm writing a little Haskell compiler, and I want to implement as much Haskell 2010 as possible. My compiler can parse a module, but completing modules to a program seems to be a non-trivial task. I made up some examples of tricky, but maybe valid, Haskell modules:
module F(G.x) where
import F as G
x = 2
Here the module F
exports G.x
, but G.x
is the same as F.x
, so module F
exports x
if, and only if, it exports x
.
module A(a) where
import B(a)
a = 2
module B(a) where
import A(a)
In this example, to resolve the exports of module A
the compiler has to check if a
imported from B
is the same as the declared a = 2
, but B
exports a
if, and only if, A
exports a
.
module A(f) where
import B(f)
module B(f) where
import A(f)
During resolving module A
, the compiler may've assumed that f
imported from B
exists, implying that A
exports f
, thus B
can import A(f)
and export f
. The only problem is that there's no f
defined anywhere :).
module A(module X) where
import A as X
import B as X
import C as X
a = 2
module B(module C, C.b) where
import C
b = 3
module C(module C)
import B as C
c = 4
Here, the module
exports cause that export lists are dependent on each other and on themselves.
All these examples should be valid Haskell, as defined by the Haskell 2010 spec.
I want to ask if there is any idea how to correctly and completely implement Haskell modules?
Assume that a module contains just (simple) variable bindings, import
s (possibly with as
or qualified
), and exports list of possibly qualified variables and module ...
abbreviations. The algorithm has to be able to:
You may be interested in A Formal Specification for the Haskell 98 Module System.
I'm also covering some interesting edge cases in a series of blog posts, of which only the first one is published as of now.
Finally, I'm working on exactly that — a library that handles Haskell modules. It's called haskell-names.
Depending on your goals, you can simply use it in your compiler, study the source code or contribute. (Your examples will make excellent test cases.)
To answer your question: recursive modules are handled by computing a fixed point.
You start with a strongly connected component in the module graph. For each module in this component, you start with an assumption that it exports nothing. Then you revisit these modules, and compute new export lists based on the fresh information. You can prove that this process is monotonic — every time the export list grows (or, at least, does not shrink). Sooner or later it stops growing — then you've reached the fixed point.
You can optimize this algorithm by borrowing some ideas from static analysis (that community is very good at computing fixed points), but my package currently implement the naive algorithm (code).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With