What is the preferable memory layout of Algebraic Data Types?

Question

Algebraic Data Types (ADTs) are types consisting of unit, product and sum types with possible recursion.

Consider a simple ADT in Haskell:

data Tree = Empty
          | Leaf Int
          | Node Tree Tree

Or a different example in Rust:

enum Message {
    Quit,
    ChangeColor(i32, i32, i32),
    Move { x: i32, y: i32 },
    Write(String),
}

How does Haskell (garbage collected) and Rust (not garbage collected) actually represent these in memory, and how should they be represented?

I'm primarily interested in the non garbage collected case which is simpler, but the solution must be workable both for the heap and stack if non-garbage collected, like for Rust.

Representations in LLVM, or C/C++ are what I'm interested in.

Using the second example, I can think of two options:

Option 1, using unions:

enum MCtor { CQ, CCC, CM, CW  };

struct Message {
    enum MCtor ctor;
    union {
        void*; /* Quit */
        struct { int r; int g; int b; } /* ChangeColor */
        struct { int x; int y; } /* Move */
        struct { char* str; } /* Write */
    };
};

Option 2, using separate allocation, void* and (bit)casts:

enum MCtor { CQ, CCC, CM, CW  };

typedef struct { int r; int g; int b; } ChangeColor;
typedef struct { int x; int y; } Move;
typedef struct { char* str; } Write;

struct Message {
    enum MCtor ctor;
    void* data;
};

Pattern matching is then simply a matter of switch on msg -> ctor.

Which one is preferable, especially considering recursive types?

Off the top of my head, I guess that the locality of the first one is better and that it avoids loads, while it might use more memory... So the second option has a better memory footprint but worse performance...?

ErikR · Accepted Answer

Here are some resources for explaining how GHC implements data structures:

Johan Tibell's talk at ZuriHac 2015 (video) (slides) (talk starts at slide 42)
GHC Commentary: The Layout of Heap Objects

What is the preferable memory layout of Algebraic Data Types?

Tags:

llvm

algebraic-data-types

Centril

1 Answers

ErikR

Recent Activity

Donate For Us

What is the preferable memory layout of Algebraic Data Types?

Tags:

llvm

algebraic-data-types

Centril

1 Answers

ErikR

Related questions

Recent Activity

Donate For Us