What's the most efficient way to concatenate strings, if I already only have ReadOnlySpan slices?
Simplified example:
public class Program {
public string ConcatSpans(string longstring) {
var span = longstring.AsSpan();
var sb = new StringBuilder(longstring.Length);
sb.Append(span.Slice(40, 10));
sb.Append(span.Slice(30, 10));
sb.Append(span.Slice(20, 10));
sb.Append(span.Slice(10, 10));
sb.Append(span.Slice(0, 10));
return sb.ToString();
}
[Benchmark]
public void ConcatSpansBenchmark() {
ConcatSpans("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
}
public static void Main(string[] args) {
var summary = BenchmarkRunner.Run<Program>();
}
}
Results:
BenchmarkDotNet=v0.11.2, OS=Windows 10.0.17134.345 (1803/April2018Update/Redstone4)
Intel Core i5-2500K CPU 3.30GHz (Sandy Bridge), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=2.1.403
[Host] : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
DefaultJob : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
Method | Mean | Error | StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
--------------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:|
ConcatSpansBenchmark | 126.6 ns | 1.712 ns | 1.601 ns | 0.0966 | - | - | 304 B |
Is StringBuilder
really the best we can do? Is there a way to go faster than that? With even less allocations? After all StringBuilder
object itself is a heap object.
If there was a ref struct
StringBuilder
that would only keep references to ReadOnlySpans
and in the final ToString
just allocate one string object?
Edit: as Tseng notes in the comments, the newer string.Create
method is the way to do this on platforms where it exists.
The scenario with multiple (but known) input spans is ideal for a "preallocate a dummy string, then pretend that strings are mutable and overwrite it before exposing it to the world" scenario. This looks gnarly, but this trick is very common in IO code when dealing with strings (especially from discontiguous buffers, etc), so it is well understood and supported.
Here we go (edit: now with added "hybrid" method, which avoids all the Slice()
calls, without needing unsafe
):
Method | Mean | Error | StdDev | Median |
------------------------------ |---------:|----------:|----------:|---------:|
ConcatSpansBenchmark | 97.17 ns | 2.1335 ns | 4.0072 ns | 97.20 ns |
OverwiteStringBenchmark | 63.34 ns | 1.2914 ns | 2.0854 ns | 62.29 ns |
UnsafeOverwriteBenchmark | 17.95 ns | 0.3697 ns | 0.3796 ns | 17.80 ns |
OverwiteStringHybridBenchmark | 53.59 ns | 0.5534 ns | 0.5176 ns | 53.49 ns |
Note: anything involving MemoryMarshal.*
, Unsafe.*
or the unsafe
keyword is explicitly "I know what I'm doing... anything that exploded is probably my fault".
Code:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Text;
public class Program
{
public string ConcatSpans(string longstring)
{
var span = longstring.AsSpan();
var sb = new StringBuilder(longstring.Length);
sb.Append(span.Slice(40, 10));
sb.Append(span.Slice(30, 10));
sb.Append(span.Slice(20, 10));
sb.Append(span.Slice(10, 10));
sb.Append(span.Slice(0, 10));
return sb.ToString();
}
public string OverwiteString(string longstring)
{
var span = longstring.AsSpan();
var s = new string('\0', longstring.Length);
var writeable = MemoryMarshal.AsMemory(s.AsMemory()).Span;
span.Slice(40, 10).CopyTo(writeable);
writeable = writeable.Slice(10);
span.Slice(30, 10).CopyTo(writeable);
writeable = writeable.Slice(10);
span.Slice(20, 10).CopyTo(writeable);
writeable = writeable.Slice(10);
span.Slice(10, 10).CopyTo(writeable);
writeable = writeable.Slice(10);
span.Slice(0, 10).CopyTo(writeable);
return s;
}
public string OverwiteStringHybrid(string longstring)
{
var source = MemoryMarshal.AsBytes(MemoryMarshal.AsMemory(longstring.AsMemory()).Span);
var s = new string('\0', longstring.Length);
var target = MemoryMarshal.AsBytes(MemoryMarshal.AsMemory(s.AsMemory()).Span);
Unsafe.CopyBlock(ref target[0], ref source[40 * sizeof(char)], 10 * sizeof(char));
Unsafe.CopyBlock(ref target[10], ref source[30 * sizeof(char)], 10 * sizeof(char));
Unsafe.CopyBlock(ref target[20], ref source[20 * sizeof(char)], 10 * sizeof(char));
Unsafe.CopyBlock(ref target[30], ref source[10 * sizeof(char)], 10 * sizeof(char));
Unsafe.CopyBlock(ref target[40], ref source[0], 10 * sizeof(char));
return s;
}
public unsafe string UnsafeOverwrite(string longstring)
{
var s = new string('\0', longstring.Length);
fixed (char* source = longstring)
fixed (char* target = s)
{
Unsafe.CopyBlock(target, source + 40, 10 * sizeof(char));
Unsafe.CopyBlock(target + 10, source + 30, 10 * sizeof(char));
Unsafe.CopyBlock(target + 20, source + 20, 10 * sizeof(char));
Unsafe.CopyBlock(target + 30, source + 10, 10 * sizeof(char));
Unsafe.CopyBlock(target + 40, source, 10 * sizeof(char));
}
return s;
}
[Benchmark]
public void ConcatSpansBenchmark()
=> ConcatSpans("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
[Benchmark]
public void OverwiteStringBenchmark()
=> OverwiteString("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
[Benchmark]
public void UnsafeOverwriteBenchmark()
=> UnsafeOverwrite("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
[Benchmark]
public void OverwiteStringHybridBenchmark()
=> OverwiteStringHybrid("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
public static void Main(string[] args)
=> BenchmarkRunner.Run<Program>();
}
Note: in the general case - to get unsafe
code from a slice:
with C# 7.3:
fixed(char* p = theSpan)
{
...
}
otherwise:
fixed(char* p = &MemoryMarshal.GetReference(theSpan))
{
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With