In a function that reads data (data meaning exclusively strings) from disk, which should I prefer? Which is better?
A) DiskStream.Read(Pointer(s)^, Count)
or
B) DiskStream.Read(s[1], Count)
Note:
I know both give the same result.
I know that I have to call SetLength on S before calling Read.
UPDATE
S is AnsiString.
Here is the full function:
{ Reads a bunch of chars from the file. Why 'ReadChars' and not 'ReadString'? This function reads C++ strings (the length of the string was not written to disk). So, I have to pass the number of chars to read as a parameter. }
function TMyStream.ReadChars(out s: AnsiString; CONST Count: Longint): Boolean;
begin
  SetLength(s, Count);
  Result:= Read(s[1], Count)= Count;
end;
Speed test
In my speed test the first approach was a tiny bit faster than the second one. I used a 400MB file from which I read strings about 200000 times. The process was set to High priority.
The best read time ever was:
1.35 for variant B and 1.37 for variant A.
Average:
On average, B also scored 20 ms better than A.
The test was repeated 15 times for each variant.
The difference is really small. It could fall within the measurement error. It might become significant if I read strings more often and from a bigger file. But for the moment, let's say that both lines of code perform the same.
ANSWER
Variant A - might be a tiny tiny bit faster
Variant B - is (obviously) much easier to read and more Delphi-ish. My preferred choice.
Note:
I have seen Embarcadero using variant A in TStreamReadBuffer example, but with a TBytes instead of String.
Definitely the array notation. Part of Delphi style is to make your code easy to read, and it's easier to tell what's going on when you spell out exactly what you're doing. Casting a string to a pointer and then dereferencing it looks confusing; why are you doing that? It doesn't make sense unless the reader knows a lot about string internals.
Be aware that of these two:
1. DiskStream.Read(Pointer(s)^, Count)
2. DiskStream.Read(s[1], Count)
version 1 will be faster.
But you must be sure that the s variable is explicitly local, or that you have called UniqueString(s) yourself before the loop.
Since Pointer(s)^ won't trigger the low-level hidden UniqueString() RTL call, it will be faster than s[1], but you may overwrite some existing data if the s string variable is shared between the current context and another context (e.g. if the last content of s was retrieved from a function or from a property value, or if s is sent as a parameter to another method).
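The sharing hazard can be seen without any stream at all. The following is only a minimal sketch: the program name and the variables a and b are made up for illustration, and a direct write through the pointer stands in for what Read would do to the buffer.

```pascal
program SharedStringDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}

var
  a, b: AnsiString; // hypothetical names, for illustration only
begin
  // Build a heap-allocated string (a literal could live in read-only memory).
  SetLength(a, 3);
  a[1] := 'a';
  a[2] := 'b';
  a[3] := 'c';

  // b now shares a's buffer: the reference count is 2, no copy is made.
  b := a;

  // Writing through the raw pointer skips UniqueString, so the shared
  // buffer is modified and a changes as well.
  PAnsiChar(Pointer(b))^ := 'X';
  WriteLn(a, ' ', b); // prints: Xbc Xbc

  // Indexing for write makes the compiler call UniqueString first,
  // so b gets its own copy and a is left alone.
  b[1] := 'Y';
  WriteLn(a, ' ', b); // prints: Xbc Ybc
end.
```

This is why the pointer form is only safe when s is known to be uniquely referenced, for example right after SetLength.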
In fact, the fastest correct way of coding this AnsiString read is:
s := '';
SetLength(s,Count);
DiskStream.Read(pointer(s)^,Count);
or
SetString(s,nil,Count);
DiskStream.Read(pointer(s)^,Count);
The 2nd version is equivalent to the 1st, but one line shorter.
Setting s to '' will call FreeMem()+AllocMem() instead of ReallocMem() in SetLength(), so it will avoid a call to move() and will therefore be a bit faster.
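Applied to the function from the question, the SetString form might look like the sketch below. It is an assumption-laden illustration, not the asker's actual class: ReadChars is written here as a hypothetical free-standing function taking a TStream, a TMemoryStream stands in for the disk file, and the guard for Count <= 0 is an addition of this sketch.

```pascal
program ReadCharsSketch;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}

uses
  Classes; // may need to be System.Classes, depending on the Delphi version and project settings

// Hypothetical free-standing variant of the ReadChars method from the question.
function ReadChars(Stream: TStream; out s: AnsiString; Count: Longint): Boolean;
begin
  if Count <= 0 then
  begin
    // Nothing to read; avoid touching an empty string's buffer.
    s := '';
    Result := Count = 0;
    Exit;
  end;
  // A nil source buffer allocates a fresh string of Count chars without
  // copying anything, so s is uniquely owned and Pointer(s)^ is safe.
  SetString(s, nil, Count);
  Result := Stream.Read(Pointer(s)^, Count) = Count;
end;

var
  ms: TMemoryStream;
  src, s: AnsiString;
begin
  src := 'hello world';
  ms := TMemoryStream.Create;
  try
    // Fill the in-memory stand-in for the disk file.
    ms.WriteBuffer(Pointer(src)^, Length(src));
    ms.Position := 0;

    if ReadChars(ms, s, 5) then
      WriteLn(s); // s = 'hello'

    // Only 6 bytes remain, so asking for 100 reports failure.
    if not ReadChars(ms, s, 100) then
      WriteLn('short read');
  finally
    ms.Free;
  end;
end.
```

Note that on a short read the string keeps its full requested length, so a caller should check the Boolean result before using s.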
In fact, the UniqueString() RTL call generated by s[1] will be very fast, since you have already called SetLength() before it: s is therefore already unique, and the UniqueString() RTL call will return almost immediately. After profiling, there is not much speed difference between the two versions: almost all the time is spent in string allocation and in moving content from disk. Perhaps s[1] is found to be more "pascalish".