
Reducing memory of similar objects

I'm looking at reducing the memory consumption of a table-like collection object.

Given a class structure like

class Cell
{
    public int Data { get; set; }
    public string Format { get; set; }
}

class Table
{
    public Dictionary<Position, Cell> Cells { get; set; }
}

When there are a large number of cells, the Data property of the Cell class may vary, but the Format property may be repeated many times; e.g. the header cells may have an empty format string for titles and the data cells may all be "0.00".

One idea is to do something like the following

class Cell
{
    public int Data { get; set; }
    public int FormatId { get; set; }
}

class Table
{
    public Dictionary<Position, Cell> Cells { get; set; }
    private Dictionary<int, string> Formats { get; set; } // keyed by FormatId

    public string GetCellFormat(Position position)
    {
        return Formats[Cells[position].FormatId];
    }
}

This would save memory on strings; however, the FormatId integer value would still be repeated many times.

Is there a better implementation than this? I've looked at the flyweight pattern but am unsure if it matches this.

A more complex implementation I am considering is removing the Format property from the Cell class altogether and instead storing the formats in a dictionary that groups adjacent cells together, e.g. there may be 2 entries like this:
<item rowFrom=1 rowTo=1 format="" />
<item rowFrom=2 rowTo=1000 format="0.00" />
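
A rough sketch of that range-based idea, keyed on rows only. The names `FormatRange` and `RowFormats` are hypothetical, and the ranges are assumed not to overlap:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: one entry per run of adjacent rows sharing a format.
class FormatRange
{
    public int RowFrom;
    public int RowTo;
    public string Format;
}

class RowFormats
{
    // Kept sorted by RowFrom; ranges are assumed not to overlap.
    private readonly List<FormatRange> ranges = new List<FormatRange>();

    public void Add(int rowFrom, int rowTo, string format)
    {
        ranges.Add(new FormatRange { RowFrom = rowFrom, RowTo = rowTo, Format = format });
        ranges.Sort((x, y) => x.RowFrom.CompareTo(y.RowFrom));
    }

    // Binary search for the range containing the row; returns null if none does.
    public string GetFormat(int row)
    {
        int lo = 0, hi = ranges.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            FormatRange r = ranges[mid];
            if (row < r.RowFrom) hi = mid - 1;
            else if (row > r.RowTo) lo = mid + 1;
            else return r.Format;
        }
        return null;
    }
}
```

With the two entries above, `GetFormat(1)` would give the empty string and any row from 2 to 1000 would give "0.00", at the cost of a lookup per cell instead of a field read.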

Chris Herring asked Jan 23 '23 00:01



2 Answers

For strings, you could look at interning: either with the inbuilt interner (string.Intern), or (preferably) a custom interner, which is basically a Dictionary<string,string>. This means that identical strings all share the same reference, and the duplicates can be garbage-collected.

Don't do anything with the int; that is already optimal.

For example:

using System;
using System.Collections.Generic;
class StringInterner {
    private readonly Dictionary<string, string> lookup
        = new Dictionary<string, string>();
    public string this[string value] {
        get {
            if(value == null) return null;
            if(value == "") return string.Empty;
            string result;
            lock (lookup) { // remove the lock if thread safety is not needed
                if (!lookup.TryGetValue(value, out result)) {
                    lookup.Add(value, value);
                    result = value;
                }
            }
            return result;
        }
    }
    public void Clear() {
        lock (lookup) { lookup.Clear(); }
    }
}
static class Program {
    static void Main() {
        // this line is to defeat the inbuilt compiler interner
        char[] test = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd' };

        string a = new string(test), b = new string(test);
        Console.WriteLine(ReferenceEquals(a, b)); // false
        StringInterner cache = new StringInterner();
        string c = cache[a], d = cache[b];
        Console.WriteLine(ReferenceEquals(c, d)); // true
    }
}

You could take this further with WeakReference if desired.

Note importantly that you don't need to change your design - you just change the code that populates the object to use the interner/cache.
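
As a minimal sketch of what that could look like: `CellLoader` is a hypothetical name, and the lookup is inlined here (not thread-safe) purely to keep the example self-contained. The Cell class keeps its string Format property; only the code that creates cells changes.

```csharp
using System;
using System.Collections.Generic;

class Cell
{
    public int Data { get; set; }
    public string Format { get; set; }
}

// Hypothetical loader: Cell itself is unchanged, only its population differs.
class CellLoader
{
    // Same idea as the StringInterner above, inlined for self-containment.
    private readonly Dictionary<string, string> formats
        = new Dictionary<string, string>();

    public Cell Create(int data, string format)
    {
        if (format != null)
        {
            string shared;
            if (formats.TryGetValue(format, out shared))
                format = shared;             // reuse the instance seen first
            else
                formats.Add(format, format); // first occurrence becomes the shared one
        }
        return new Cell { Data = data, Format = format };
    }
}
```

Every cell created through the same loader with an equal format string then points at one shared string instance.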

Marc Gravell answered Jan 25 '23 13:01



Have you determined whether this is actually a problem? The CLR does a lot of string interning on your behalf, so it is possible (depending on CLR version and how your code was compiled) that you are not using as much memory as you think you are.

I would highly recommend that you validate your suspicions about memory utilization before you change your design.
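
One rough way to check, as a sketch only: compare `GC.GetTotalMemory` before and after building the structure. `MemoryCheck` is a hypothetical helper, and the figure it returns is approximate; a memory profiler will give a more reliable picture.

```csharp
using System;

static class MemoryCheck
{
    // Returns the approximate number of managed bytes retained by
    // whatever 'build' returns.
    public static long Measure(Func<object> build)
    {
        long before = GC.GetTotalMemory(true); // true forces a collection first
        object result = build();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(result);                  // keep the result alive until measured
        return after - before;
    }
}
```

Running it once with the table populated normally and once with shared format strings would show whether the duplicates account for a meaningful share of the total.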

Andrew Hare answered Jan 25 '23 13:01
