
How do I implement a data oriented design in Rust?

Background

In game engine development we usually use a data oriented design for optimal memory and computation performance.

Let's take a particle system as an example.

In a particle system, we have a lot of particles, and each particle may have several attributes such as positions, velocities, etc.

A typical implementation in C++ would be like this:

struct Particle {
    float positionX, positionY, positionZ;
    float velocityX, velocityY, velocityZ;
    float mass;
    // ...
};

struct ParticleSystem {
    std::vector<Particle> particles;
    // ...
};

One problem with this implementation is that the particle attributes are interleaved with each other. This memory layout is not cache friendly and may not be suitable for SIMD computations.

Instead, in a data oriented design, we write the following code:

struct ParticleAttribute {
    size_t size;
    size_t alignment;
    const char* semantic;
};

struct ParticleSystem {
    ParticleSystem(
        size_t numParticles,
        const ParticleAttribute* attributes,
        size_t numAttributes) {
        size_t bufferSize = 0;
        for (size_t i = 0; i < numAttributes; ++i) {
            bufferSize += attributes[i].size * numParticles;
            // Also add paddings to satisfy the alignment requirements.
        }
        particleBuffer = static_cast<uint8_t*>(malloc(bufferSize));
    }

    uint8_t* getAttribute(const char* semantic) {
        // Locate the semantic in attributes array.
        // Compute the offset to the starting address of that attribute.
        // Return particleBuffer plus that offset.
    }

    uint8_t* particleBuffer;
};

Now we have only one allocation, and each attribute resides in memory contiguously. To simulate the particles, we may write the following code:

symplecticEuler(ps.getAttribute("positionX"), ps.getAttribute("velocityX"), dt);

The getAttribute function will get the starting address of a particular attribute.

Question

I would like to know how to implement this in Rust.

My idea is to first create a struct called ParticleSystem, which takes several ParticleAttributes to calculate the total buffer size and then allocates the memory for the buffer. I think this can be done in safe Rust.

The next step is to implement the getAttribute function, which will return a reference to the starting address of a specific attribute. I need your help here. How do I get the raw address with an offset, cast it to a desired type (such as float*), and wrap that raw pointer in a mutable reference in Rust?

In addition, I think I should wrap that raw pointer in a mutable reference to an array, because I need to use a SIMD library to load four elements through that reference. How do I achieve this in Rust?


Update: to provide more information about the attributes: the number and details of the attributes are determined at runtime. The types of the attributes can vary, but I think we only have to support the primitive ones (f32, f64, ints, ...).

TheBusyTypist asked Sep 28 '16 18:09


1 Answer

That's a very complicated way of implementing DOD, and the idea of using run-time lookup for getters makes me cringe.

The simple version is to have one memory allocation per attribute:

struct Particles {
    x: Vec<f32>,
    y: Vec<f32>,
}

which requires knowing the attributes beforehand.

Then there are no shenanigans for getting all the ys: they are just sitting there, already typed, waiting for you.
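
To make the one-Vec-per-attribute layout concrete, here is a minimal sketch of an integration step over it. The `vx` field, the `step` helper and the constant acceleration `a` are assumptions added for illustration; they are not part of the question or of the struct above. The loop walks the slices in groups of four, which is the shape a 4-lane SIMD load would consume, and handles the leftover elements separately.

```rust
/// Structure-of-arrays storage: one Vec per attribute.
/// The field `vx` (velocity along x) is hypothetical, added for illustration.
struct Particles {
    x: Vec<f32>,
    vx: Vec<f32>,
}

/// One symplectic Euler step along one axis under a constant acceleration `a`
/// (an assumed parameter): velocity is updated first, then position.
fn symplectic_euler(pos: &mut [f32], vel: &mut [f32], a: f32, dt: f32) {
    assert_eq!(pos.len(), vel.len());
    let mut pos_chunks = pos.chunks_exact_mut(4);
    let mut vel_chunks = vel.chunks_exact_mut(4);
    // Fixed-size groups of four elements, the shape a 4-lane SIMD load expects.
    for (p4, v4) in pos_chunks.by_ref().zip(vel_chunks.by_ref()) {
        for (p, v) in p4.iter_mut().zip(v4.iter_mut()) {
            *v += a * dt;
            *p += *v * dt;
        }
    }
    // Leftover elements when the length is not a multiple of four.
    let (p_rest, v_rest) = (pos_chunks.into_remainder(), vel_chunks.into_remainder());
    for (p, v) in p_rest.iter_mut().zip(v_rest.iter_mut()) {
        *v += a * dt;
        *p += *v * dt;
    }
}

/// Hypothetical helper: borrows two distinct fields mutably at once,
/// which safe Rust allows because they are separate Vecs.
fn step(ps: &mut Particles, a: f32, dt: f32) {
    symplectic_euler(&mut ps.x, &mut ps.vx, a, dt);
}
```

Calling `step(&mut ps, a, dt)` once per frame is enough; no raw pointers or casts are involved, since each attribute is already a typed, contiguous slice.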


Extending this to dynamically determined attributes is not that complicated:

  • we can use a HashMap<String, xxx> to look-up a given attribute at run-time
  • we can use an enum to have a single Value to be stored in the hash-map which can take a variety of forms (the other solution would be using a trait)

This becomes:

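use std::collections::HashMap;

// f32 and f64 implement neither Hash nor Eq, so only Debug and PartialEq are derived.
#[derive(Debug, PartialEq)]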
enum Value {
    UniformInt(i64),
    UniformFloat32(f32),
    UniformFloat64(f64),
    DistinctInt(Vec<i64>),
    DistinctFloat32(Vec<f32>),
    DistinctFloat64(Vec<f64>),
}

struct Particles {
    store: HashMap<String, Value>,
}
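
As a rough sketch of how such a store might be used, the block below looks an attribute up by name and advances a position column by a velocity. It uses a trimmed-down `Value` with only the f32 variants, and the method names `float32_mut` and `advance` are hypothetical. Because both attributes live in the same map, the position column is taken out temporarily so that the velocity can be borrowed while the position is mutated.

```rust
use std::collections::HashMap;

// Trimmed-down Value: only the f32 variants, for brevity.
#[derive(Debug, PartialEq)]
enum Value {
    UniformFloat32(f32),
    DistinctFloat32(Vec<f32>),
}

struct Particles {
    store: HashMap<String, Value>,
}

impl Particles {
    /// Borrow a per-particle f32 attribute as a mutable slice, if an
    /// attribute with that name and that type exists.
    fn float32_mut(&mut self, name: &str) -> Option<&mut [f32]> {
        match self.store.get_mut(name) {
            Some(Value::DistinctFloat32(values)) => Some(values.as_mut_slice()),
            _ => None,
        }
    }

    /// Add `vel * dt` to `pos`. Returns false if either attribute is
    /// missing, has an unsupported type, or the lengths differ.
    fn advance(&mut self, pos: &str, vel: &str, dt: f32) -> bool {
        // Take the position column out of the map for the duration of the update.
        let mut p = match self.store.remove(pos) {
            Some(Value::DistinctFloat32(p)) => p,
            Some(other) => {
                // Wrong type: put it back untouched.
                self.store.insert(pos.to_string(), other);
                return false;
            }
            None => return false,
        };
        let ok = match self.store.get(vel) {
            Some(Value::DistinctFloat32(v)) if v.len() == p.len() => {
                for (pi, vi) in p.iter_mut().zip(v) {
                    *pi += *vi * dt;
                }
                true
            }
            Some(Value::UniformFloat32(v)) => {
                for pi in p.iter_mut() {
                    *pi += *v * dt;
                }
                true
            }
            _ => false,
        };
        // Always put the position column back, updated or not.
        self.store.insert(pos.to_string(), Value::DistinctFloat32(p));
        ok
    }
}
```

The string lookup happens once per call, not once per particle, so its cost is amortized over the whole column.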

We could alternatively use 6 hash-maps... but unless one knows a priori what the type is (when the only thing one has is a string), then one has to look through all hashmaps one at a time: annoying, and time wasting.
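
If a single allocation as in the question is still wanted, one possible sketch stays in safe Rust by assuming every attribute is an f32, so the buffer can be a `Vec<f32>` and no pointer casts are needed. The names `PackedParticles`, `attribute_mut` and `pair_mut` are hypothetical. Mixed element types would need unsafe casts with alignment checks, which this sketch does not attempt.

```rust
use std::collections::HashMap;

/// One contiguous buffer holding `count` f32 values per attribute,
/// laid out attribute after attribute.
struct PackedParticles {
    buffer: Vec<f32>,
    offsets: HashMap<String, usize>, // start index of each attribute in `buffer`
    count: usize,
}

impl PackedParticles {
    /// Allocate one zeroed buffer for all named attributes.
    fn new(names: &[&str], count: usize) -> Self {
        let offsets = names
            .iter()
            .enumerate()
            .map(|(i, name)| (name.to_string(), i * count))
            .collect();
        PackedParticles { buffer: vec![0.0; names.len() * count], offsets, count }
    }

    /// Typed view of one attribute: the offset lookup the question describes.
    fn attribute_mut(&mut self, name: &str) -> Option<&mut [f32]> {
        let start = *self.offsets.get(name)?;
        Some(&mut self.buffer[start..start + self.count])
    }

    /// Two different attributes at once, returned in the order requested.
    /// `split_at_mut` proves to the borrow checker that they do not overlap.
    fn pair_mut(&mut self, a: &str, b: &str) -> Option<(&mut [f32], &mut [f32])> {
        let (sa, sb) = (*self.offsets.get(a)?, *self.offsets.get(b)?);
        if sa == sb {
            return None;
        }
        let n = self.count;
        let (lo, hi) = (sa.min(sb), sa.max(sb));
        let (head, tail) = self.buffer.split_at_mut(hi);
        let (first, second) = (&mut head[lo..lo + n], &mut tail[..n]);
        Some(if sa < sb { (first, second) } else { (second, first) })
    }
}
```

Each returned slice can then be walked with `chunks_exact_mut(4)` to feed four elements at a time to a SIMD routine.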

Matthieu M. answered Nov 19 '22 05:11