
What's the difference between InplaceStringBuilder and StringBuilder?

Today in VS2017, IntelliSense surfaced InplaceStringBuilder when I was trying to type StringBuilder. InplaceStringBuilder is new to me, so I started digging to see what I could learn.

The first thing I noticed is that it's a struct and not a class, and its type information looks like this:

#region Assembly Microsoft.Extensions.Primitives, Version=1.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60
// C:\Users\Ron Clabo\Documents\Visual Studio 2017\Projects\wwwGiftOasisResponsive\packages\Microsoft.Extensions.Primitives.1.1.0\lib\netstandard1.0\Microsoft.Extensions.Primitives.dll
#endregion

using System.Diagnostics;

namespace Microsoft.Extensions.Primitives {
    [DebuggerDisplay("Value = {_value}")]
    public struct InplaceStringBuilder {
        public InplaceStringBuilder(int capacity);

        public int Capacity { get; set; }

        public void Append(string s);
        public void Append(char c);
        public override string ToString();
    }
}

So it has a lot fewer methods than StringBuilder. I then googled around to learn more about InplaceStringBuilder, but there's not much on the web about it yet, so it looks to be pretty new.

Besides the differences I have already mentioned, what are the differences between InplaceStringBuilder and StringBuilder, and when should a developer use the new InplaceStringBuilder instead of the age-old StringBuilder?

RonC asked Mar 22 '17 18:03


3 Answers

Since it only does one allocation, InplaceStringBuilder is more efficient for well-known, reasonably-sized strings. We might want to use it in methods that need to be very efficient.

It was introduced with pull request #157, which includes the following commentary.

Intended to be used instead of pooled StringBuilder or string.Concat when all parts of string are known... Does only 1 allocation of resulting string... should only be used for well-known reasonably sized strings. For everything else, use StringBuilder... do not use across await points...
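As a rough sketch of that intended use, the snippet below joins three known parts using only the members listed in the question (the constructor, Append, and ToString). The InplaceDemo class, the BuildUrl method, and its parameters are made up for illustration, and the code assumes the Microsoft.Extensions.Primitives 1.1.0 package from the question is referenced.

```csharp
using Microsoft.Extensions.Primitives; // assumption: the NuGet package shown in the question

// Hypothetical helper, not part of the library.
public static class InplaceDemo
{
    public static string BuildUrl(string scheme, string host)
    {
        const string separator = "://";

        // All parts are known up front, so the final length can be computed exactly.
        var builder = new InplaceStringBuilder(scheme.Length + separator.Length + host.Length);

        builder.Append(scheme);
        builder.Append(separator);
        builder.Append(host);

        // The builder has been filled to its capacity, so the result can be returned.
        return builder.ToString();
    }
}
```

The capacity passed to the constructor is the sum of the part lengths. If the final length can't be computed up front like this, StringBuilder is the better fit.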

The history of the PR tells the story:

  1. July 2016, Issue #676 notices unnecessary allocations.
  2. Sept 2016, Pull request #699 resolves issue #676 and proposes factoring out the "inplace string formatting into a struct..."
  3. Sept 2016, Issue #717 formalizes the proposal.
  4. Sept 2016, Pull request #157 implements the proposal.

Shaun Luttin answered Sep 24 '22 20:09


The normal StringBuilder increases the capacity when the resulting string is longer than the initial capacity. The InplaceStringBuilder is limited to its capacity and throws an exception if the resulting string is longer.

This is quite a limitation of InplaceStringBuilder, so it may only be suitable in rare cases. In addition, if you know the capacity in advance, it was already possible to define the initial capacity of a normal StringBuilder.
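For the StringBuilder side of that comparison, here is a small sketch using only System.Text. CapacityDemo and its method are hypothetical names. It shows that an initial capacity is just a starting point for StringBuilder, which grows when more is appended.

```csharp
using System.Text;

// Hypothetical demo class, for illustration only.
public static class CapacityDemo
{
    // Returns { capacity before, capacity after, final length }.
    public static int[] GrowPastInitialCapacity()
    {
        var sb = new StringBuilder(4);   // request an initial capacity of 4 characters
        int before = sb.Capacity;

        sb.Append("0123456789");         // 10 characters, more than the initial capacity

        // No exception is thrown: the builder has grown to hold all 10 characters.
        return new[] { before, sb.Capacity, sb.Length };
    }
}
```

As described above, an InplaceStringBuilder with too small a capacity throws on that Append instead of growing.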

Source: look at the implementation of InplaceStringBuilder on GitHub and compare it to the StringBuilder documentation on MSDN.

Thomas Weller answered Sep 21 '22 20:09


InplaceStringBuilder is a very fast way to build a string by appending chunks, when you know in advance how long the final string is going to be. It works by preallocating a string of fixed size, and then mutating that string (unsafely, through a pointer) as you append chunks to the builder. When you call ToString, the preallocated string, now full of data, is returned directly without copying. (StringBuilder makes a copy of its content when you call ToString.)

Strings are ordinarily immutable, so you'd better make sure not to mutate the string by calling Append after you've called ToString. InplaceStringBuilder tries to make this safe by only allowing appends (you can't rewind and write to a part of the string which has already been appended) and by requiring you to fill the entire string before calling ToString.

However, InplaceStringBuilder is a mutable struct, which means it's passed around by copying. If you copy the builder (e.g. by passing it as an argument), the copy can get out of sync with the original. Specifically, the copy's _offset field, which tracks how many characters have been appended so far (and therefore where the next Append call writes), may refer to a location in the string that the original has already written.
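The copy hazard can be reproduced with a toy stand-in. ToyBuilder below is a hypothetical struct written for this answer, not the library's implementation: it uses a char array instead of mutating a string through a pointer, and allocates its buffer eagerly. It has the same shape of problem, though: the buffer is shared between copies while the offset is copied by value.

```csharp
using System;

// Hypothetical stand-in for a mutable builder struct (not the real InplaceStringBuilder).
public struct ToyBuilder
{
    private readonly char[] _buffer; // reference: shared by every copy of the struct
    private int _offset;             // value: each copy gets its own

    public ToyBuilder(int capacity)
    {
        _buffer = new char[capacity];
        _offset = 0;
    }

    public void Append(string s)
    {
        if (_offset + s.Length > _buffer.Length)
            throw new InvalidOperationException("Not enough capacity.");

        s.CopyTo(0, _buffer, _offset, s.Length);
        _offset += s.Length;
    }

    // Returns the characters this particular copy believes it has written.
    public string Snapshot() => new string(_buffer, 0, _offset);
}

public static class CopyHazardDemo
{
    public static string Run()
    {
        var original = new ToyBuilder(6);
        var copy = original;      // struct copy: same buffer, separate _offset

        original.Append("abc");   // original._offset is now 3
        copy.Append("XY");        // copy._offset was still 0, so this overwrites "ab"

        return original.Snapshot(); // "XYc", not "abc"
    }
}
```

The original never called Append a second time, yet the characters it wrote were changed through the copy.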

This is to say, InplaceStringBuilder is unsafe and risky to use. If not handled carefully, you risk breaking one of string's most important properties, immutability. Make sure you know what you're doing!

Benjamin Hodgson answered Sep 22 '22 20:09