I'm looking at reducing the memory consumption of a table-like collection object.
Given a class structure like:
class Cell
{
    public int Data { get; set; }
    public string Format { get; set; }
}
class Table
{
    public Dictionary<Position, Cell> Cells { get; set; }
}
When there are a large number of cells, the Data property of the Cell class may vary, but the Format property may be repeated many times, e.g. the header cells may have an empty format string for titles and the data cells may all be "0.00".
One idea is to do something like the following:
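class Cell
{
    public int Data { get; set; }
    public int FormatId { get; set; }
}
class Table
{
    public Dictionary<Position, Cell> Cells { get; set; }
    // Keyed by FormatId, so each distinct format string is stored once.
    private Dictionary<int, string> Formats { get; set; }
    public string GetCellFormat(Position position)
    {
        return Formats[Cells[position].FormatId];
    }
}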
This would save memory on strings; however, the FormatId integer value would still be repeated many times.
Is there a better implementation than this? I've looked at the flyweight pattern but am unsure if it matches this.
A more complex implementation I am considering is removing the Format property from the Cell class altogether and instead storing the formats in a dictionary that groups adjacent cells together, e.g. there may be 2 entries like this:
<item rowFrom="1" rowTo="1" format="" />
<item rowFrom="2" rowTo="1000" format="0.00" />
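The grouped-range idea above could be sketched roughly as follows. This is only an illustration under some assumptions: the names FormatRange and RowFormats are hypothetical, formats are assumed to apply to whole rows, and ranges are assumed not to overlap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: one format applied to an inclusive range of rows.
class FormatRange
{
    public int RowFrom;
    public int RowTo;
    public string Format;
}

// Hypothetical container: stores one entry per run of rows, not one per cell.
class RowFormats
{
    // Kept sorted by RowFrom; ranges are assumed not to overlap.
    private readonly List<FormatRange> ranges = new List<FormatRange>();

    public void Add(int rowFrom, int rowTo, string format)
    {
        ranges.Add(new FormatRange { RowFrom = rowFrom, RowTo = rowTo, Format = format });
        ranges.Sort((x, y) => x.RowFrom.CompareTo(y.RowFrom));
    }

    // Binary search for the range containing the row; returns null if none does.
    public string GetFormat(int row)
    {
        int lo = 0, hi = ranges.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            FormatRange r = ranges[mid];
            if (row < r.RowFrom) hi = mid - 1;
            else if (row > r.RowTo) lo = mid + 1;
            else return r.Format;
        }
        return null;
    }
}
```

With the two entries from the example, GetFormat(1) would return the empty string and GetFormat(500) would return "0.00". The trade-off is that lookups become a search rather than a field read, and edits that split a range need extra bookkeeping that this sketch does not attempt.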
For strings, you could perhaps look at interning; either with the inbuilt interner, or (preferably) a custom interner - basically a Dictionary<string,string>. What this means is that each identical string uses the same reference - and the duplicates can be collected.
Don't do anything with the int; that is already optimal.
For example:
using System;
using System.Collections.Generic;
class StringInterner {
    private readonly Dictionary<string, string> lookup
        = new Dictionary<string, string>();
    public string this[string value] {
        get {
            if(value == null) return null;
            if(value == "") return string.Empty;
            string result;
            lock (lookup) { // remove if not needed to be thread-safe
                if (!lookup.TryGetValue(value, out result)) {
                    lookup.Add(value, value);
                    result = value;
                }
            }
            return result;
        }
    }
    public void Clear() {
        lock (lookup) { lookup.Clear(); }
    }
}
static class Program {
    static void Main() {
        // this line is to defeat the inbuilt compiler interner
        char[] test = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd' };
        string a = new string(test), b = new string(test);
        Console.WriteLine(ReferenceEquals(a, b)); // false
        StringInterner cache = new StringInterner();
        string c = cache[a], d = cache[b];
        Console.WriteLine(ReferenceEquals(c, d)); // true
    }
}
You could take this further with WeakReference if desired.
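As a rough sketch of what a WeakReference-based variant might look like (an assumption about one possible approach, not a tested production design): the hypothetical WeakStringInterner below keys its buckets by hash code, so the cache itself never holds a strong reference to a string and unused formats can still be garbage collected. It assumes WeakReference<T> is available (.NET 4.5 or later) and is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical weak variant of the interner; not thread-safe.
class WeakStringInterner
{
    // Keyed by hash code so the dictionary holds no strong reference to any string.
    private readonly Dictionary<int, List<WeakReference<string>>> buckets
        = new Dictionary<int, List<WeakReference<string>>>();

    public string Intern(string value)
    {
        if (value == null) return null;
        if (value.Length == 0) return string.Empty;

        int hash = value.GetHashCode();
        List<WeakReference<string>> bucket;
        if (!buckets.TryGetValue(hash, out bucket))
        {
            bucket = new List<WeakReference<string>>();
            buckets.Add(hash, bucket);
        }

        // Drop entries whose strings have already been collected.
        bucket.RemoveAll(w => { string s; return !w.TryGetTarget(out s); });

        // Reuse a live, equal string if one is still around.
        foreach (WeakReference<string> weak in bucket)
        {
            string existing;
            if (weak.TryGetTarget(out existing) && existing == value)
                return existing;
        }

        bucket.Add(new WeakReference<string>(value));
        return value;
    }
}
```

Note that empty buckets are never removed in this sketch, so a long-lived cache that sees many distinct strings would need an occasional cleanup pass.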
Note importantly that you don't need to change your design - you just change the code that populates the object to use the interner/cache.
Have you determined whether or not this is actually a problem? The CLR automatically interns string literals on your behalf, so it is possible (depending on CLR version and how your code was compiled) that you are not using as much memory as you think you are.
I would highly recommend that you validate your suspicions about memory utilization before you change your design.
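One crude way to check, as a sketch only: compare GC.GetTotalMemory before and after building a large number of distinct format strings. The MemoryCheck class and the count used here are hypothetical, and the figure is approximate; a memory profiler would give a more reliable picture.

```csharp
using System;

// Hypothetical helper for a rough before/after memory comparison.
static class MemoryCheck
{
    // Returns the approximate number of bytes retained by 'count' distinct "0.00" strings.
    public static long MeasureDistinctStrings(int count)
    {
        long before = GC.GetTotalMemory(true); // force a collection first
        string[] formats = new string[count];
        for (int i = 0; i < count; i++)
        {
            // new string(char[]) creates a separate instance each time,
            // unlike a literal, which the runtime would intern.
            formats[i] = new string(new[] { '0', '.', '0', '0' });
        }
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(formats); // keep the array alive until after the measurement
        return after - before;
    }

    static void Main()
    {
        Console.WriteLine("Approximate bytes used: {0}",
            MeasureDistinctStrings(100000));
    }
}
```

Running the same measurement with the strings passed through an interner would show how much, if anything, is actually saved for your data.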