I use IN a lot in my project and I have lots of these warnings:
[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions. Consider using CharInSet function in SysUtils unit.
I made a quick test and using CharInSet instead of IN is from 65%-100% slower:
if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then
vs
if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then
Here is code for 2 tests, one works with loop through shorter strings, one loops once through a large string:
Adding 2 buttons on form I tested this for short string:
procedure TForm1.Button1Click(Sender: TObject);
var s1: string;
t1, t2: TStopWatch;
a, i, cnt, vMaxLoop: Integer;
begin
s1 := '[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions. Consider using CharInSet function in SysUtils unit.';
vMaxLoop := 10000000;
cnt := 0;
t1 := TStopWatch.Create;
t1.Start;
for a := 1 to vMaxLoop do
for i := 1 to Length(s1) do
if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then
inc(cnt);
t1.Stop;
cnt := 0;
t2 := TStopWatch.Create;
t2.Start;
for a := 1 to vMaxLoop do
for i := 1 to Length(s1) do
if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then
inc(cnt);
t2.Stop;
Button1.Caption := inttostr(t1.ElapsedMilliseconds) + ' - ' + inttostr(t2.ElapsedMilliseconds);
end;
And this for 1 long string:
procedure TForm1.Button2Click(Sender: TObject);
var s1: string;
t1, t2: TStopWatch;
a, i, cnt, vMaxLoop: Integer;
begin
s1 := '[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions. Consider using CharInSet function in SysUtils unit.';
s1 := DupeString(s1, 1000000);
s1 := s1 + s1 + s1 + s1; // DupeString is limited, use this to create longer string
cnt := 0;
t1 := TStopWatch.Create;
t1.Start;
for i := 1 to Length(s1) do
if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then
inc(cnt);
t1.Stop;
cnt := 0;
t2 := TStopWatch.Create;
t2.Start;
for i := 1 to Length(s1) do
if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then
inc(cnt);
t2.Stop;
Button2.Caption := inttostr(t1.ElapsedMilliseconds) + ' - ' + inttostr(t2.ElapsedMilliseconds);
end;
Why do they recommend slower option, or how can I fix this warning without penalty in performance?
The warning is telling you that your code may be defective. Because sets can only be based on types with ordinality of 256 or less, the base type is truncated to that size. Now, Char
is an alias for WideChar
and has ordinality 65536. So the warning is there to tell you that your program may not behave as you expect. For instance, one might ask what this expression evaluates to:
['A', chr(256)] = ['A']
One might expect it to evaluate false, but in fact it evaluates true. So I think you should certainly take heed of the compiler when it issues this warning.
Now, it so happens that your set, which can and should be written more concisely as ['A'..'Z']
, is made up entirely of ASCII characters. And it happens (thanks to commentors Andreas and ventiseis) that in that case the compiler generates correct code for such a set, regardless of the ordinal value of the character to the left of the in
operator. So
if s1[i] in ['A'..'Z'] then
will result in correct code, in spite of the warning. And the compiler is able to detect that the set's elements are contiguous and generate efficient code.
Note that this does depend on the set being a literal and so the optimisation can be performed by the compiler. And that is why it can perform so much better than CharInSet
. Because CharInSet
is a function, and the Delphi optimiser has limited power, CharInSet
is not able to take advantage of the contiguous nature of this specific set literal.
The warning is annoying though, and do you really want to rely on remembering the very specific details of when this warning can safely be ignored. Another way to implement the test, and sidestep this warning is to use inequality operators:
if (c >= 'A') and (c <= 'Z') then
....
You'd probably wrap this in an inlined function to make the code even easier to read.
function IsUpperCaseEnglishLetter(c: Char): Boolean; inline;
begin
Result := (c >= 'A') and (c <= 'Z');
end;
You should also ask yourself whether or not this code is a performance bottleneck. You should time your real program rather than such an artificial program. I'll bet that this code isn't a bottleneck and if so you should not treat performance as the key driver.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With