I am looking at the Roslyn September 2012 CTP with Reflector, and I noticed that the SlidingTextWindow class has the following:
internal sealed class SlidingTextWindow : IDisposable
{
    private static readonly ConcurrentQueue<char[]> arrayPool = new ConcurrentQueue<char[]>();
    private int basis;
    private readonly LexerBaseCache cache;
    private char[] characterWindow;
    private int characterWindowCount;
    private int characterWindowStart;
    private int offset;
    private readonly IText text;
    private readonly int textEnd;

    public SlidingTextWindow(IText text, LexerBaseCache cache)
    {
        this.text = text;
        this.basis = 0;
        this.characterWindowStart = 0;
        this.offset = 0;
        this.textEnd = text.Length;
        this.cache = cache;
        if (!arrayPool.TryDequeue(out this.characterWindow))
        {
            this.characterWindow = new char[2048];
        }
    }

    public void Dispose()
    {
        arrayPool.Enqueue(this.characterWindow);
        this.characterWindow = null;
    }

    // ...
}
I believe the purpose of this class is to provide fast access to substrings of the input text by using the char[] characterWindow, starting with 2048 characters at a time (although the characterWindow may grow). I believe this is because it is faster to take substrings of character arrays than of strings, as Eric Lippert seems to indicate on his blog.
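To illustrate what I mean, here is a minimal sketch of that windowing idea (my own simplification, not Roslyn's actual code; the TinyTextWindow name, the plain string source, and the GetText method are hypothetical stand-ins for IText and the real API): the source text is bulk-copied into a reusable char[] once, and substrings are then built straight from that array.

using System;

internal sealed class TinyTextWindow
{
    private readonly string text;              // stand-in for Roslyn's IText
    private readonly char[] window = new char[2048];
    private int windowStart;                   // position in text of window[0]
    private int windowCount;                   // number of valid chars in the window

    public TinyTextWindow(string text)
    {
        this.text = text;
    }

    private void Fill(int position)
    {
        // One bulk copy from the source into the reusable buffer.
        windowStart = position;
        windowCount = Math.Min(window.Length, text.Length - position);
        text.CopyTo(position, window, 0, windowCount);
    }

    public string GetText(int position, int length)
    {
        // Simplification: assumes length <= window.Length.
        if (position < windowStart || position + length > windowStart + windowCount)
        {
            Fill(position);
        }

        // Build the substring directly from the char[] window,
        // without going back to the source text for each character.
        return new string(window, position - windowStart, length);
    }
}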
The SlidingTextWindow class is instantiated each time the Lexer class is instantiated, which happens on every call to SyntaxTree.ParseText.
I do not understand the purpose of the arrayPool field. Its only usage in this class is in the constructor and Dispose methods. When calling SyntaxTree.ParseText, there seems to be only one instance of the Lexer class and of the SlidingTextWindow class created. What advantage is gained by enqueuing the characterWindow when an instance is disposed and by trying to dequeue a characterWindow when an instance is created?
Perhaps somebody from the Roslyn team could help me understand this?
The advantage is that collection pressure is reduced, which has a positive effect on overall performance.
The .NET garbage collector is of course a general-purpose garbage collector. The allocation and object lifetime patterns of a compiler and IDE are quite different than those of your average line-of-business application, and they tend to stress the GC in unusual ways.
If you look throughout Roslyn there are many places where small arrays are cached and re-used later rather than allowing the GC to identify them as short-lived trash and reclaim them immediately. Empirical experiments show that this gives a measurable improvement in performance.
I don't recommend doing so in your own application unless your profiling indicates that you have a measurable performance problem gated on collection pressure. For the vast majority of applications the GC is very well tuned, and the benefit of a pooling strategy is not worth the considerable costs.
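As a rough sketch of that pattern (not Roslyn's code; the CharArrayPool name and its Rent/Return methods are hypothetical), the idea is simply to park finished buffers in a thread-safe queue and hand them back out on the next request instead of allocating a fresh array each time:

using System.Collections.Concurrent;

internal static class CharArrayPool
{
    private static readonly ConcurrentQueue<char[]> pool = new ConcurrentQueue<char[]>();

    public static char[] Rent()
    {
        // Reuse a previously returned buffer if one is available;
        // otherwise pay the allocation cost once.
        char[] buffer;
        return pool.TryDequeue(out buffer) ? buffer : new char[2048];
    }

    public static void Return(char[] buffer)
    {
        // Keep the buffer alive for the next caller instead of
        // letting it become short-lived garbage.
        pool.Enqueue(buffer);
    }
}

A SlidingTextWindow-style consumer would call Rent in its constructor and Return in Dispose, so the 2048-char buffer allocated for the first parse is reused by every subsequent parse rather than becoming gen-0 garbage on each call.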