Update: It is acceptable if this method is not thread safe, but I'm interested in learning how I would make it thread safe. Also, I do not want to lock on a single object for all values of key if I can avoid it.
Original Question: Suppose I want to write a higher order function that takes a key and a function, and checks if an object has been cached with the given key. If it has, the cached value is returned. Otherwise, the given function is run and the result is cached and returned.
Here's a simplified version of my code:
public static T CheckCache<T>(string key, Func<T> fn, DateTime expires)
{
    object cache = HttpContext.Current.Cache.Get(key);
    //clearly not thread safe, two threads could both evaluate the below condition as true
    //what can I lock on since the value of "key" may not be known at compile time?
    if (cache == null)
    {
        T result = fn();
        HttpContext.Current.Cache.Insert(key, result, null, expires, Cache.NoSlidingExpiration);
        return result;
    }
    else
        return (T)cache;
}
Also, suppose I do not know all possible values of key at compile time.
How can I make this thread safe? I know I need to introduce locking here, to prevent more than one thread from evaluating my condition as true, but I don't know what to lock on. Many of the examples I've read about locking (such as Jon Skeet's article) recommend using a "dummy" private variable that's used only for locking. This isn't possible in this case, because keys are unknown at compile time. I know I could trivially make this thread safe by having the same lock be used for every key, but that could be wasteful.
Now, my main question is:
Is it possible to lock on key? Will string interning help here?
After reading .NET 2.0 string interning inside out, I understand that I can explicitly call String.Intern() to obtain a 1 to 1 mapping from the value of a string to instance of a string. Is this suitable to lock on? Let's change the above code to:
public static T CheckCache<T>(string key, Func<T> fn, DateTime expires)
{
    //check for the scenario where two strings with the same value are stored at different memory locations
    key = String.Intern(key);
    lock (key) //is this object suitable for locking?
    {
        object cache = HttpContext.Current.Cache.Get(key);
        if (cache == null)
        {
            T result = fn();
            HttpContext.Current.Cache.Insert(key, result, null, expires, Cache.NoSlidingExpiration);
            return result;
        }
        else
            return (T)cache;
    }
}
Is the above implementation thread safe?
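Problems with @wsanville's own solution, partly mentioned before:
1. Other parts of the code base might lock on the same interned string instances for a different purpose (for example, code written by someone unaware of your String.Intern locking pattern) - note that this includes locks on the same interned string even if they are in different AppDomains, potentially leading to cross-AppDomain deadlocks.
2. Memory used by interned strings cannot be reclaimed on demand; it is not likely to be released until the CLR terminates.
3. String.Intern() is slow.
To address all these 3 issues, you could implement your own Intern() that you tie to your specific locking purpose, i.e. do not use it as a global, general-purpose string interner: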
private static readonly ConcurrentDictionary<string, string> concSafe =
    new ConcurrentDictionary<string, string>();

static string InternConcurrentSafe(string s)
{
    return concSafe.GetOrAdd(s, String.Copy);
}
I called this method ...Safe(), because when interning I will not store the passed in String instance, as that might e.g. be an already interned String, making it subject to the problems mentioned in 1. above.
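To make the point about instance identity concrete, here is a minimal sketch. The class names LockInterner and IdentityDemo are hypothetical, and the copy is made with new string(k.ToCharArray()) instead of String.Copy, on the assumption that you may be compiling against a newer .NET version where String.Copy is marked obsolete. For non-empty strings both produce a new instance with the same characters.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical wrapper around the private interner described above.
public static class LockInterner
{
    private static readonly ConcurrentDictionary<string, string> concSafe =
        new ConcurrentDictionary<string, string>();

    public static string InternConcurrentSafe(string s)
    {
        // Store a fresh copy, never the caller's instance, which might
        // already live in the CLR-wide intern pool.
        return concSafe.GetOrAdd(s, k => new string(k.ToCharArray()));
    }
}

public static class IdentityDemo
{
    // True when the private interner returns an instance that has the same
    // value as, but is a different object from, the one in the CLR intern pool.
    // Assumes a non-empty key: an empty char array yields String.Empty itself.
    public static bool IsIsolatedFromInternPool(string key)
    {
        string pooled = String.Intern(key);
        string mine = LockInterner.InternConcurrentSafe(key);
        return mine == pooled && !object.ReferenceEquals(mine, pooled);
    }
}
```

Equal keys still map to one shared lock object, because the dictionary compares strings by value, while code elsewhere that locks on String.Intern(key) gets a different object.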
To compare the performance of various ways of interning strings, I also tried the following 2 methods, as well as String.Intern.
private static readonly ConcurrentDictionary<string, string> conc =
    new ConcurrentDictionary<string, string>();

static string InternConcurrent(string s)
{
    return conc.GetOrAdd(s, s);
}

private static readonly Dictionary<string, string> locked =
    new Dictionary<string, string>(5000);

static string InternLocked(string s)
{
    string interned;
    lock (locked)
        if (!locked.TryGetValue(s, out interned))
            interned = locked[s] = s;
    return interned;
}
Benchmark
100 threads, each randomly selecting one of 5000 different strings (each containing 8 digits) 50000 times and then calling the respective intern method. All values were measured after warming up sufficiently. This is Windows 7, 64-bit, on a 4-core i5.
N.B. Warming up the above setup implies that after warming up, there won't be any writes to the respective interning dictionaries, but only reads. It's what I was interested in for the use case at hand, but different write/read ratios will probably affect the results.
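The setup described above could be reproduced with a harness along the following lines. This is only a sketch under assumptions: the class name InternBenchmark and its parameters are hypothetical, and the original measurement code is not shown in this answer, so details such as how the random strings were chosen may differ.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class InternBenchmark
{
    // Runs `intern` from `threadCount` threads, `iterations` times each, picking
    // keys at random from `keyCount` 8-digit strings. Returns elapsed milliseconds.
    public static long Run(Func<string, string> intern,
                           int threadCount, int keyCount, int iterations)
    {
        string[] keys = Enumerable.Range(0, keyCount)
            .Select(i => (10000000 + i).ToString())
            .ToArray();

        // Warm up: after this loop every key is known to the interner,
        // so the timed section performs reads only.
        foreach (string k in keys)
            intern(k);

        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int seed = t; // one Random per thread; Random is not thread safe
            threads[t] = new Thread(() =>
            {
                var rnd = new Random(seed);
                for (int i = 0; i < iterations; i++)
                    intern(keys[rnd.Next(keys.Length)]);
            });
        }

        var sw = Stopwatch.StartNew();
        foreach (Thread th in threads) th.Start();
        foreach (Thread th in threads) th.Join();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

A call such as InternBenchmark.Run(String.Intern, 100, 5000, 50000) would then correspond to the parameters quoted above; absolute timings depend on the machine and runtime.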
Results
String.Intern(): 2032 ms
InternLocked(): 1245 ms
InternConcurrent(): 458 ms
InternConcurrentSafe(): 453 ms
The fact that InternConcurrentSafe is as fast as InternConcurrent makes sense in light of the fact that these figures are after warming up (see above N.B.), so there are in fact no or only a few invocations of String.Copy during the test.
public class StringLocker
{
    private readonly ConcurrentDictionary<string, string> _locks =
        new ConcurrentDictionary<string, string>();

    public string GetLockObject(string s)
    {
        return _locks.GetOrAdd(s, String.Copy);
    }
}
and after instantiating one StringLocker for every use case you might have, it is as easy as calling
lock (myStringLocker.GetLockObject(s))
{
    ...
}
N.B.
Thinking again, there's no need to return an object of type string if all you want to do is lock on it, so copying the characters is totally unnecessary, and the following would perform better than the class above.
public class StringLocker
{
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    public object GetLockObject(string s)
    {
        return _locks.GetOrAdd(s, k => new object());
    }
}
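As a rough illustration of how such per-key lock objects might be combined with the cache check from the question, consider the sketch below. The class KeyedCache and its method GetOrCompute are hypothetical names, and a ConcurrentDictionary stands in for HttpContext.Current.Cache so that the example runs outside ASP.NET; expiration is deliberately left out.

```csharp
using System;
using System.Collections.Concurrent;

public class KeyedCache
{
    // One lock object per key, created on demand (same idea as StringLocker).
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    // Stand-in for HttpContext.Current.Cache (assumption for this sketch).
    private readonly ConcurrentDictionary<string, object> _values =
        new ConcurrentDictionary<string, object>();

    public T GetOrCompute<T>(string key, Func<T> fn)
    {
        object cached;

        // Fast path: no locking when the value is already cached.
        if (_values.TryGetValue(key, out cached))
            return (T)cached;

        lock (_locks.GetOrAdd(key, k => new object()))
        {
            // Check again: another thread may have filled the cache
            // while this one was waiting for the lock.
            if (_values.TryGetValue(key, out cached))
                return (T)cached;

            T result = fn();
            _values[key] = result;
            return result;
        }
    }
}
```

With this shape, callers using different keys do not block each other, and fn runs at most once per key for as long as the value stays in the stand-in cache.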
A variant of Daniel's answer...
Rather than creating a new lock object for every single string you could share a small-ish set of locks, choosing which lock to use depending on the string's hashcode. This will mean less GC pressure if you potentially have thousands, or millions, of keys, and should allow enough granularity to avoid any serious blocking (perhaps after a few tweaks, if necessary).
public static T CheckCache<T>(string key, Func<T> fn, DateTime expires)
{
    object cached = HttpContext.Current.Cache[key];
    if (cached != null)
        return (T)cached;

    int stripeIndex = (key.GetHashCode() & 0x7FFFFFFF) % _stripes.Length;

    lock (_stripes[stripeIndex])
    {
        // check again: another thread may have cached the value while we waited for the lock
        cached = HttpContext.Current.Cache[key];
        if (cached != null)
            return (T)cached;

        T result = fn();
        HttpContext.Current.Cache.Insert(key, result, null, expires,
                                         Cache.NoSlidingExpiration);
        return result;
    }
}

// share a set of 32 locks
private static readonly object[] _stripes = Enumerable.Range(0, 32)
                                                      .Select(x => new object())
                                                      .ToArray();
This will allow you to tweak the locking granularity to suit your particular needs just by changing the number of elements in the _stripes array. (However, if you need close to one-lock-per-string granularity then you're better off going with Daniel's answer.)
Never lock on strings, and in particular not on those that are interned. See this blog entry on the danger of locking on interned strings.
Just create a new object and lock on that:
object myLock = new object();