I finally upgraded to Delphi XE. I have a library of units where I use strings to store plain ANSI characters (chars between A and U). I am 101% sure that I will never ever use UNICODE characters in those places.
I want to convert all other libraries to Unicode, but for this specific library I think it will be better to stick with ANSI. The advantage is the memory requirement as in some cases I load very large TXT files (containing ONLY Ansi characters). The disadvantage might be that I have to do lots and lots of typecasts when I make those libraries to interact with normal (unicode) libraries.
There are some general guidelines to show when is good to convert to Unicode and when to stick with Ansi?
The AnsiString type is a long, reference-counted string containing AnsiChar characters. By default, Delphi treats the generic string type as synonymous with AnsiString . If you use the $H- or $LongStrings compiler directives, though, string becomes the same as ShortString .
Description. The Char type represents a single character, just as it does in standard Pascal. You can cast any integer type to a character by using the Char type name in a type cast. Unlike some other type casts, the size of the integer value does not have to match the size of a Char.
The problem with general guidelines is that something like this can be very specific to a person's situation. Your example here is one of those.
However, for people Googling and arriving here, some general guidelines are:
Yes, convert to Unicode. Don't try to keep an old app fully using AnsiString
s. The reason is that the whole VCL is Unicode, and you shouldn't try to mix the two, because you will convert every time you assign a Unicode string to an ANSI string, and that is a lossy conversion. Trying to keep the old way because it's less work (or some similar reason) will cause you pain; just embrace the new string
type, convert, and go with it.
Instead of randomly mixing the two, explicitly perform any conversions you need to, once - for example, if you're loading data from an old version of your program you know it will be ANSI, so read it into a Unicode string there, and that's it. Ever after, it will be Unicode.
You should not need to change the type of your string
variables - string
pre-D2009 is ANSI, and in D2009 and alter is Unicode. Instead, follow compiler warnings and watch which string methods you use - some still take an AnsiString
parameter and I find it all confusing. The compiler will tell you.
If you use strings to hold bytes (in other words, using them as an array of bytes because a character was a byte) switch to TBytes
.
You may encounter specific problems for things like encryption (strings are no longer byte/characters, so 'character' for 'character' you may get different output); reading text files (use the stream classes and TEncoding); and, frankly, miscellaneous stuff. Search here on SO, most things have been asked before.
Commenters, please add more suggestions... I mostly use C++Builder, not Delphi, and there are probably quite a few specific things for Delphi I don't know about.
Now for your specific question: should you convert this library?
If:
then not converting to Unicode, and instead switching your string
s to AnsiString
s, makes sense.
Be aware that:
UTF8String
, which is a specific type of AnsiString
that will not be lossy when converted, and will still store most text (Roman characters) in a single bytestring
to AnsiString
could be a bit of work, and you will need to check all the methods called with them to see if too many implicit conversions are being performed (for performance), etcif 'S' in MySet
?) won't work. From your description of characters A to U, I could guess you would like to use this syntax.My recommendation? Personally, the only reason I would do this from the information you've given is the memory use, and possibly performance depending on what you're doing with this huge amount of A..U
s. If that truly is significant, it's both the driver and the constraint, and you should convert to ANSI.
You should be able to wrap up the conversion at the interface between this unit and its clients. Use AnsiString internally and string everywhere else and you should be fine.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With