Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between WideChar and AnsiChar?

I'm upgrading some ancient (from 2003) Delphi code to Delphi Architect XE and I'm running into a few problems. I am getting a number of errors where there are incompatible types. These errors don't happen in Delphi 6 so I must assume that this is because things have been upgraded.

I honestly don't know what the difference between PAnsiChar and PWideChar is, but Delphi sure knows the difference and won't let me compile. If I knew what the differences were maybe I could figure out which to use or how to fix this.

like image 867
Daisetsu Avatar asked Feb 08 '11 23:02

Daisetsu


2 Answers

The short: prior to Delphi 2009 the native string type in Delphi used to be ANSI CHAR: Each char in every string was represented as an 8 bit char. Starting with Delphi 2009 Delphi's strings became UNICODE, using the UTF-16 notation: Now the basic Char uses 16 bits of data (2 bytes), and you probably don't need to know much about the Unicode code points that are represented as two consecutive 16 bits chars.

The 8 bit chars are called "Ansi Chars". An PAnsiChar is a pointer to 8 bit chars. The 16 bit chars are called "Wide Chars". An PWideChar is a pointer to 16 bit chars. Delphi knows the difference and does well if it doesn't allow you to mix the two!

More info

Here's a popular link on Unicode: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets

You can find some more information on migrating Delphi to Unicode here: New White Paper: Delphi Unicode Migration for Mere Mortals

You may also search SO for "Delphi Unicode migration".

like image 80
Cosmin Prund Avatar answered Sep 18 '22 23:09

Cosmin Prund


A couple years ago, the default character type in Delphi was changed from AnsiChar (single-byte variable representing an ANSI character) to WideChar (two-byte variable representing a UTF16 character.) The char type is now an alias to WideChar instead of AnsiChar, the string type is now an alias to UnicodeString (a UTF-16 Unicode version of Delphi's traditional string type) instead of AnsiString, and the PChar type is now an alias to PWideChar instead of PAnsiChar.

The compiler can take care of a lot of the conversions itself, but there are a few issues:

  1. If you're using string-pointer types, such as PChar, you need to make sure your pointer is pointing to the right type of data, and the compiler can't always verify this.
  2. If you're passing strings to var parameters, the variable type needs to be exactly the same. This can be more complicated now that you've got two string types to deal with.
  3. If you're using string as a convenient byte-array buffer for holding arbitrary data instead of a variable that holds text, that won't work as a UnicodeString. Make sure those are declared as RawByteString as a workaround.
  4. Anyplace you're dealing with string byte lengths, for example when reading or writing to/from a TStream, make sure your code isn't assuming that a char is one byte long.

Take a look at Delphi Unicode Migration for Mere Mortals for some more tricks and advice on how to get this to work. It's not as hard as it sounds, but it's not trivial either. Good luck!

like image 27
Mason Wheeler Avatar answered Sep 16 '22 23:09

Mason Wheeler