Before we start, I am aware of the term "premature optimization". However the following snippets have proven to be an area where improvements can be made.
Alright. We currently have some network code that works with string based packets. I am aware that using strings for packets is stupid, crazy and slow. Sadly, we don't have any control over the client and so have to use strings.
Each packet is terminated by \0\r\n
and we currently use a StreamReader/Writer to read individual packets from the stream. Our main bottleneck comes from two places.
Firstly: We need to trim that nasty little null-byte off the end of the string. We currently use code like the following:
line = await reader.ReadLineAsync();
line = line.Replace("\0", ""); // PERF this allocates a new string
if (string.IsNullOrWhiteSpace(line))
return null;
var packet = ClientPacket.Parse(line, cl.Client.RemoteEndPoint);
As you can see by that cute little comment, we have a GC performance issue when trimming the '\0'. There are numerous different ways you could trim a '\0' off the end of a string, but all will result in the same GC hammering we get. Because all string operations are immutable, they result in a new string object being created. As our server handles 1000+ connections all communicating at around 25-40 packets per second (its a game server), this GC matter is becoming an issue. So here comes my first question: What is a more efficient way of trimming that '\0' off the end of our string? By efficient I don't only mean speed, but also GC wise (ultimately I'd like a way to get rid of it without creating a new string object!).
Our second issue also stems from GC land. Our code looks somewhat like the following:
private static string[] emptyStringArray = new string[] { }; // so we dont need to allocate this
public static ClientPacket Parse(string line, EndPoint from)
{
const char seperator = '|';
var first_seperator_pos = line.IndexOf(seperator);
if (first_seperator_pos < 1)
{
return new ClientPacket(NetworkStringToClientPacketType(line), emptyStringArray, from);
}
var name = line.Substring(0, first_seperator_pos);
var type = NetworkStringToClientPacketType(name);
if (line.IndexOf(seperator, first_seperator_pos + 1) < 1)
return new ClientPacket(type, new string[] { line.Substring(first_seperator_pos + 1) }, from);
return new ClientPacket(type, line.Substring(first_seperator_pos + 1).Split(seperator), from);
}
(Where NetworkStringToClientPacketType
is simply a big switch-case block)
As you can see we already do a few things to handle GC. We reuse a static "empty" string and we check for packets with no parameters. My only issue here is that we are using Substring a lot, and even chain a Split on the end of a Substring. This leads to (for an average packet) almost 20 new string objects being created and 12 being disposed of EACH PACKET. This causes a lot of performance issues when load increases anything over 400 users (we gotz fast ram :3)
Has anyone had an experience with this sort of thing before or could give us some pointers into what to look into next? Maybe some magical classes or some nifty pointer magic?
(PS. StringBuilder doesn't help as we aren't building strings, we are generally splitting them.)
We currently have some ideas based on an index based system where we store the index and length of each parameter rather than splitting them. Thoughts?
A few other things. Decompiling mscorlib and browsing the string class code, it seems to me like IndexOf
calls are done via P/Invoke, which would mean they have added overhead for each call, correct me if I'm wrong? Would it not be faster to implement an IndexOf
manually using a char[]
array?
public int IndexOf(string value, int startIndex, int count, StringComparison comparisonType)
{
...
return TextInfo.IndexOfStringOrdinalIgnoreCase(this, value, startIndex, count);
...
}
internal static int IndexOfStringOrdinalIgnoreCase(string source, string value, int startIndex, int count)
{
...
if (TextInfo.TryFastFindStringOrdinalIgnoreCase(4194304, source, startIndex, value, count, ref result))
{
return result;
}
...
}
...
[DllImport("QCall", CharSet = CharSet.Unicode)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool InternalTryFindStringOrdinalIgnoreCase(int searchFlags, string source, int sourceCount, int startIndex, string target, int targetCount, ref int foundIndex);
Then we get to String.Split which ends up calling Substring
itself (somewhere along the line):
// string
private string[] InternalSplitOmitEmptyEntries(int[] sepList, int[] lengthList, int numReplaces, int count)
{
int num = (numReplaces < count) ? (numReplaces + 1) : count;
string[] array = new string[num];
int num2 = 0;
int num3 = 0;
int i = 0;
while (i < numReplaces && num2 < this.Length)
{
if (sepList[i] - num2 > 0)
{
array[num3++] = this.Substring(num2, sepList[i] - num2);
}
num2 = sepList[i] + ((lengthList == null) ? 1 : lengthList[i]);
if (num3 == count - 1)
{
while (i < numReplaces - 1)
{
if (num2 != sepList[++i])
{
break;
}
num2 += ((lengthList == null) ? 1 : lengthList[i]);
}
break;
}
i++;
}
if (num2 < this.Length)
{
array[num3++] = this.Substring(num2);
}
string[] array2 = array;
if (num3 != num)
{
array2 = new string[num3];
for (int j = 0; j < num3; j++)
{
array2[j] = array[j];
}
}
return array2;
}
Thankfully Substring looks fast (and efficient!):
private unsafe string InternalSubString(int startIndex, int length, bool fAlwaysCopy)
{
if (startIndex == 0 && length == this.Length && !fAlwaysCopy)
{
return this;
}
string text = string.FastAllocateString(length);
fixed (char* ptr = &text.m_firstChar)
{
fixed (char* ptr2 = &this.m_firstChar)
{
string.wstrcpy(ptr, ptr2 + (IntPtr)startIndex, length);
}
}
return text;
}
After reading this answer here, I'm thinking a pointer based solution could be found... Thoughts?
Thanks.
You could "cheat" and work at the Encoder
level...
public class UTF8NoZero : UTF8Encoding
{
public override Decoder GetDecoder()
{
return new MyDecoder();
}
}
public class MyDecoder : Decoder
{
public Encoding UTF8 = new UTF8Encoding();
public override int GetCharCount(byte[] bytes, int index, int count)
{
return UTF8.GetCharCount(bytes, index, count);
}
public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex)
{
int count2 = UTF8.GetChars(bytes, byteIndex, byteCount, chars, charIndex);
int i, j;
for (i = charIndex, j = charIndex; i < charIndex + count2; i++)
{
if (chars[i] != '\0')
{
chars[j] = chars[i];
j++;
}
}
for (int k = j; k < charIndex + count2; k++)
{
chars[k] = '\0';
}
return count2 + (i - j);
}
}
Note that this cheat is based on the fact that StreamReader.ReadLineAsync
uses only the GetChars()
. We remove the '\0' in the temporary buffer char[]
buffer used by StreamReader.ReadLineAsync
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With