Algebraic Data Types (ADTs) are types consisting of unit, product and sum types with possible recursion.
Consider a simple ADT in Haskell:
data Tree = Empty
| Leaf Int
| Node Tree Tree
Or a different example in Rust:
enum Message {
Quit,
ChangeColor(i32, i32, i32),
Move { x: i32, y: i32 },
Write(String),
}
How does Haskell (garbage collected) and Rust (not garbage collected) actually represent these in memory, and how should they be represented?
I'm primarily interested in the non garbage collected case which is simpler, but the solution must be workable both for the heap and stack if non-garbage collected, like for Rust.
Representations in LLVM, or C/C++ are what I'm interested in.
Using the second example, I can think of two options:
Option 1, using unions:
enum MCtor { CQ, CCC, CM, CW };
struct Message {
enum MCtor ctor;
union {
void*; /* Quit */
struct { int r; int g; int b; } /* ChangeColor */
struct { int x; int y; } /* Move */
struct { char* str; } /* Write */
};
};
Option 2, using separate allocation, void*
and (bit)casts:
enum MCtor { CQ, CCC, CM, CW };
typedef struct { int r; int g; int b; } ChangeColor;
typedef struct { int x; int y; } Move;
typedef struct { char* str; } Write;
struct Message {
enum MCtor ctor;
void* data;
};
Pattern matching is then simply a matter of switch
on msg -> ctor
.
Which one is preferable, especially considering recursive types?
Off the top of my head, I guess that the locality of the first one is better and that it avoids loads, while it might use more memory... So the second option has a better memory footprint but worse performance...?
Here are some resources for explaining how GHC implements data structures:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With