
How do the new string types work in Delphi 2009/2010?

I have to convert a large legacy application to Delphi 2009. It uses strings, AnsiStrings, WideStrings and UTF-8 data all over the place, and I have a hard time understanding how the new string types work and how they should be used.

The application fully supported Unicode using TntUnicodeControls, and there are third-party DLLs which require strings in specific encodings, mostly UTF-8 and UTF-16, making the conversion task less trivial than one might expect.

I especially have problems with the C DLL calls and choosing the right type. I also get the impression that there are many implicit string conversions happening, because one of the DLLs seems to always receive UTF-8 encoded strings, no matter how the Delphi string is encoded.

Can someone please provide a short overview of the new Delphi 2009 string types UnicodeString and RawByteString, perhaps with some usage hints and possible pitfalls when converting a pre-2009 application?

Daniel Rikowski asked Sep 09 '09 12:09





1 Answer

See Delphi and Unicode, a white paper written by Marco Cantù, and probably also The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), written by Joel Spolsky.

One pitfall is that the unsuffixed Win32 API names are now mapped to the W (wide string) versions instead of the A (ANSI) versions; for example, ShellExecute now resolves to ShellExecuteW rather than ShellExecuteA. If your code does tricky pointer work that assumes the internal layout of AnsiString, it will break. A fallback is to substitute PChar with PAnsiChar, Char with AnsiChar, and string with AnsiString, and to append A to the Win32 API calls in that portion of code. Once the code compiles and runs normally, you can refactor it to use string (UnicodeString).
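As a rough illustration of that fallback, here is a minimal sketch assuming the Delphi 2009/2010 unit names Windows and ShellAPI. The routine names OpenDocumentAnsi and OpenDocument are hypothetical and only serve to contrast the two stages; this is not taken from any real code base.

```delphi
program AnsiFallbackSketch;

{$APPTYPE CONSOLE}

uses
  Windows, ShellAPI;

// Stage 1 (fallback): pin the legacy routine to the ANSI types so it keeps
// its pre-2009 behaviour. OpenDocumentAnsi is a hypothetical name.
procedure OpenDocumentAnsi(const FileName: AnsiString);
begin
  // Explicit A suffix, because the unsuffixed ShellExecute now maps to
  // ShellExecuteW and would expect PWideChar arguments.
  ShellExecuteA(0, 'open', PAnsiChar(FileName), nil, nil, SW_SHOWNORMAL);
end;

// Stage 2 (after refactoring): use the default types, which are now
// UnicodeString and PWideChar. OpenDocument is a hypothetical name.
procedure OpenDocument(const FileName: string);
begin
  ShellExecute(0, 'open', PChar(FileName), nil, nil, SW_SHOWNORMAL);
end;

begin
  // Code that assumed one byte per character needs review: buffer sizes
  // should be computed as Length(S) * SizeOf(Char), not Length(S).
  Writeln(SizeOf(Char));      // 2 in Delphi 2009 and later
  Writeln(SizeOf(AnsiChar));  // 1
end.
```

Keeping the ANSI-pinned version compiling first means the Unicode refactoring can be done routine by routine instead of all at once.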

Eugene Yokota answered Sep 27 '22 18:09
