Using array fields instead of massive number of objects

In light of this article, I am wondering what people's experiences are with storing massive datasets (say, >10,000,000 objects) in memory using arrays to store data fields instead of instantiating millions of objects and racking up the memory overhead (say, 12-24 bytes per object, depending on which article you read). Data per property varies from item to item, so I can't use a strict Flyweight pattern, but I would envision something similar.

My idea of this sort of representation is that one has a 'template object'...

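class Thing
{
  // public so the container below can fill them in
  public double A;
  public double B;
  public int    C;
  public string D;
}

And then a container object with a method of creating an object on request...

class ContainerOfThings
{
  double[] ContainerA;
  double[] ContainerB;
  int[]    ContainerC;
  string[] ContainerD;

  public ContainerOfThings(int total)
  {
    // one array per field, each sized for the whole dataset
    ContainerA = new double[total];
    ContainerB = new double[total];
    ContainerC = new int[total];
    ContainerD = new string[total];
  }

  public Thing GetThingAtPosition(int position)
  {
     // materialize a temporary object from the field arrays
     Thing thing = new Thing(); //probably best done as a factory instead
     thing.A = ContainerA[position];
     thing.B = ContainerB[position];
     thing.C = ContainerC[position];
     thing.D = ContainerD[position];

     return thing;
  }
}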

So that's a simple strategy, but not a very versatile one. For example, one can't create a subset (as a List) of 'Thing' without duplicating data and defeating the purpose of array field storage. I haven't been able to find good examples, so I would appreciate either links or code snippets showing better ways to handle this scenario from someone who's done it, or a better idea.
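To make the subset problem concrete, here is a minimal sketch of one possible way around it, not something taken from the article or from any library. It assumes the field arrays stay the only copy of the data, a subset is represented as a list of positions, and a small struct acts as a read-only view over one position. The names `ThingColumns`, `ThingView`, and `Where` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical column store: the arrays hold the only copy of the data.
class ThingColumns
{
    public readonly double[] A;
    public readonly double[] B;
    public readonly int[] C;
    public readonly string[] D;

    public ThingColumns(int total)
    {
        A = new double[total];
        B = new double[total];
        C = new int[total];
        D = new string[total];
    }

    public int Count => A.Length;

    // A view is just (columns, position); creating one copies no field data.
    public ThingView this[int position] => new ThingView(this, position);

    // A subset is a list of positions, not a list of copied objects.
    public List<int> Where(Func<ThingView, bool> predicate)
    {
        var positions = new List<int>();
        for (int i = 0; i < Count; i++)
        {
            if (predicate(this[i])) positions.Add(i);
        }
        return positions;
    }
}

// Value type, so a view needs no heap allocation of its own.
readonly struct ThingView
{
    private readonly ThingColumns columns;
    private readonly int position;

    public ThingView(ThingColumns columns, int position)
    {
        this.columns = columns;
        this.position = position;
    }

    // Reads go straight through to the backing arrays.
    public double A => columns.A[position];
    public double B => columns.B[position];
    public int C => columns.C[position];
    public string D => columns.D[position];
}
```

With this shape, something like `things.Where(t => t.C > 0)` would return only the matching positions, and `things[p]` turns a position back into a view when the fields are needed. The cost of a subset is then one `int` per member rather than a full copy of each item.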

WolfOdrade asked Jul 10 '11 06:07


1 Answer

It depends on your concrete scenario. Depending on how often your objects are created, you can:

  1. If the objects are serializable, save them in a MemoryMappedFile (trading medium-to-low performance for low memory consumption).

  2. Share the fields between different objects: if objects initially have default values, keep all the defaults in a separate store and allocate new space only when a value becomes different from the default. (This naturally makes sense for reference types.)

  3. Another solution is to save the objects to a SQLite database. This is much easier to manage than MemoryMappedFiles, as you can use simple SQL.
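As a rough illustration of option 2, the sketch below stores one shared default and allocates an entry only for positions whose value differs from it. This is an assumed interpretation of the idea, not code from the answer, and `SparseColumn` is a hypothetical name.

```csharp
using System.Collections.Generic;

// Hypothetical sparse column: keeps only the values that differ from a shared default.
class SparseColumn<T>
{
    private readonly T defaultValue;
    private readonly Dictionary<int, T> overrides = new Dictionary<int, T>();

    public SparseColumn(T defaultValue)
    {
        this.defaultValue = defaultValue;
    }

    public T this[int position]
    {
        // Positions without an override fall back to the shared default.
        get => overrides.TryGetValue(position, out T value) ? value : defaultValue;
        set
        {
            // Writing the default back releases the entry instead of storing it.
            if (EqualityComparer<T>.Default.Equals(value, defaultValue))
                overrides.Remove(position);
            else
                overrides[position] = value;
        }
    }

    // Number of positions that actually occupy storage.
    public int StoredCount => overrides.Count;
}
```

A dictionary entry costs more than a plain array slot, so a layout like this is only likely to pay off when most positions keep the default value.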

The choice is up to you, as it depends on your concrete project requirements.

Regards.

Tigran answered Nov 10 '22 02:11