Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the preferable memory layout of Algebraic Data Types?

Algebraic Data Types (ADTs) are types consisting of unit, product and sum types with possible recursion.

Consider a simple ADT in Haskell:

data Tree = Empty
          | Leaf Int
          | Node Tree Tree

Or a different example in Rust:

enum Message {
    Quit,
    ChangeColor(i32, i32, i32),
    Move { x: i32, y: i32 },
    Write(String),
}

How does Haskell (garbage collected) and Rust (not garbage collected) actually represent these in memory, and how should they be represented?

I'm primarily interested in the non garbage collected case which is simpler, but the solution must be workable both for the heap and stack if non-garbage collected, like for Rust.

Representations in LLVM, or C/C++ are what I'm interested in.

Using the second example, I can think of two options:

Option 1, using unions:

enum MCtor { CQ, CCC, CM, CW  };

struct Message {
    enum MCtor ctor;
    union {
        void*; /* Quit */
        struct { int r; int g; int b; } /* ChangeColor */
        struct { int x; int y; } /* Move */
        struct { char* str; } /* Write */
    };
};

Option 2, using separate allocation, void* and (bit)casts:

enum MCtor { CQ, CCC, CM, CW  };

typedef struct { int r; int g; int b; } ChangeColor;
typedef struct { int x; int y; } Move;
typedef struct { char* str; } Write;

struct Message {
    enum MCtor ctor;
    void* data;
};

Pattern matching is then simply a matter of switch on msg -> ctor.

Which one is preferable, especially considering recursive types?

Off the top of my head, I guess that the locality of the first one is better and that it avoids loads, while it might use more memory... So the second option has a better memory footprint but worse performance...?

like image 513
Centril Avatar asked Nov 09 '22 14:11

Centril


1 Answers

Here are some resources for explaining how GHC implements data structures:

  • Johan Tibell's talk at ZuriHac 2015 (video) (slides) (talk starts at slide 42)
  • GHC Commentary: The Layout of Heap Objects
like image 61
ErikR Avatar answered Jan 04 '23 02:01

ErikR