
StrLComp vs AnsiStrLComp when called with Unicode strings

I'm a bit confused about the "Ansi" vs "regular" RTL string functions when they are called with Unicode strings. I understand that under older versions of Delphi (when AnsiString was the default string type) the "Ansi" versions handled multibyte characters. Does this distinction still mean anything when dealing with Unicode strings? Assuming that I need to handle Korean characters and that my code does not have to be compatible with older Delphi versions, which RTL functions should I use?

MarkF asked Dec 12 '22 04:12


1 Answer

The 'Ansi' prefix of the string compare functions really never signified anything other than that the locale was taken into account when comparing strings instead of doing "just" a simple binary comparison. In the Unicode world this is still the case. The Ansi* family of functions also take (Unicode) strings as their parameters and take the locale into account when doing the comparison.

From the AnsiCompareStr doc (D2009):

Most locales consider lowercase characters to be less than the corresponding uppercase characters. This is in contrast to ASCII order, in which lowercase characters are greater than uppercase characters. Thus, setting S1 to 'a' and S2 to 'A' causes AnsiCompareStr to return a value less than zero, while CompareStr, with the same arguments, returns a value greater than zero.
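To see the difference described in that quote, here is a minimal console sketch. The program name is made up for illustration, and the sign of the AnsiCompareStr result is only what the documentation says for "most locales", so it may differ on your machine.

```delphi
program CompareCaseDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils; // System.SysUtils when unit scope names are spelled out

begin
  // Ordinal comparison: 'a' (#97) sorts after 'A' (#65), so the result is > 0.
  Writeln('CompareStr     : ', CompareStr('a', 'A'));

  // Locale-aware comparison: the sign depends on the current user locale;
  // the quoted documentation says most locales return a value < 0 here.
  Writeln('AnsiCompareStr : ', AnsiCompareStr('a', 'A'));
end.
```

Only the sign of each result is meaningful; the exact magnitude should not be relied on.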

What "taking the locale into account" actually does differs per locale. It may or may not involve accented characters. In Unicode versions it may also take into account how the characters are composed. For example, an accented e (é) may be encoded as a single precomposed character, but it may also be encoded as two separate items: the e followed by a combining accent.
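A small sketch of the composed vs decomposed case, assuming a Unicode Delphi (2009 or later). The program and variable names are illustrative. Whether the locale-aware call treats the two encodings as equal depends on the operating system's collation rules, so the sketch prints the result instead of asserting it.

```delphi
program ComposedDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Precomposed, Decomposed: string;
begin
  Precomposed := #$00E9;        // U+00E9, "é" as a single code point
  Decomposed  := 'e' + #$0301;  // "e" followed by U+0301 COMBINING ACUTE ACCENT

  // Ordinal comparison sees different code units, so this is never 0.
  Writeln('CompareStr     : ', CompareStr(Precomposed, Decomposed));

  // Locale-aware comparison; whether this reports equality is up to the
  // OS collation rules, so print it rather than assume.
  Writeln('AnsiCompareStr : ', AnsiCompareStr(Precomposed, Decomposed));
end.
```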

Both the Ansi* and the "normal" string compare functions are included in the SysUtils unit. They all take strings as their parameters and in Unicode Delphi that does indeed mean UnicodeStrings.

If you need to work with AnsiStrings then you need to use the AnsiStrings unit. It has the same set of string compare functions, but in this unit they all take AnsiStrings as their parameters.

Now, if you don't need compatibility with older versions: use the standard functions from SysUtils. Use the normal ones if a plain binary comparison is enough. Use the Ansi ones if you need to take locale considerations into account.
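Applied to the functions from the question title, a hedged sketch with two Korean strings could look like this. The strings and the program name are made up for illustration, and it assumes a Unicode Delphi where PChar is PWideChar, so both calls resolve to the SysUtils versions.

```delphi
program PrefixCompareDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  A, B: string;
begin
  // Written as code points so the result does not depend on source encoding.
  A := #$D55C#$AD6D#$C5B4;  // 한국어
  B := #$D55C#$AD6D#$B9D0;  // 한국말

  // MaxLen counts Char elements (UTF-16 code units), not bytes.
  // The first two characters are identical, so both calls return 0.
  Writeln('StrLComp,     2 chars: ', StrLComp(PChar(A), PChar(B), 2));
  Writeln('AnsiStrLComp, 2 chars: ', AnsiStrLComp(PChar(A), PChar(B), 2));

  // The third character differs, so both results are non-zero.
  Writeln('StrLComp,     3 chars: ', StrLComp(PChar(A), PChar(B), 3));
  Writeln('AnsiStrLComp, 3 chars: ', AnsiStrLComp(PChar(A), PChar(B), 3));
end.
```

For an equality check like this, either function works. The choice matters when the sign of the result is used for ordering, because only the Ansi version follows the locale's collation.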

Marjan Venema answered Dec 28 '22 08:12