Converting an AnsiString to a Unicode String

I'm converting a D2006 program to D2010. I have a value stored as a single-byte-per-character string in my database, and I need to load it into a control that has a LoadFromStream method. My plan was to write the string to a stream and pass that stream to LoadFromStream, but it did not work. In studying the problem, I found something that tells me I don't really understand how conversion from AnsiString to a Unicode string works. Here is a piece of standalone code that illustrates what confuses me:

procedure TForm1.Button1Click(Sender: TObject); {$O-}
var
  sBuffer: String;
  oStringStream: TStringStream;
  sAnsiString: AnsiString;
  sUnicodeString: String;
  iSize1,
  iSize2: Word;
begin
  sAnsiString := '12345';
  oStringStream := TStringStream.Create(sBuffer);
  sUnicodeString := sAnsiString;
  iSize1 := StringElementSize(sAnsiString);
  iSize2 := StringElementSize(sUnicodeString);
  oStringStream.WriteString(sUnicodeString);
end;

If you break on the last line, and inspect the Bytes property of oStringStream, you will see that it looks like this:

Bytes (49 {$31}, 50 {$32}, 51 {$33}, 52 {$34}, 53 {$35}

I was expecting that it might look something like

(49 {$31}, 00 {$00}, 50 {$32}, 00 {$00}, 51 {$33}, 00 {$00}, 
 52 {$34}, 00 {$00}, 53 {$35}, 00 {$00} ...

Apparently my expectations are wrong. How, then, do I convert an AnsiString to Unicode?

I'm not getting the right results from LoadFromStream because it reads from the stream two bytes at a time, but the data it receives is not arranged that way. What should I do to give LoadFromStream a well-formed stream of data based on a Unicode string?

Thank you for your help.

asked Apr 01 '10 00:04 by jrodenhi


2 Answers

What is the type of the parameter of oStringStream.WriteString? If it is AnsiString, you have an implicit conversion from Unicode to ANSI, and that explains your example.


Updated: Now the real question is how TStringStream stores data internally. Consider the following code sample (Delphi 2009):

procedure TForm1.Button1Click(Sender: TObject);
var
  S: string;
  SS: TStringStream;
begin
  S := 'asdfg';
  SS := TStringStream.Create(S);  // 1 byte per char
  try
    SS.WriteString('321');
    Label1.Caption := SS.DataString;
  finally
    SS.Free;  // release the stream even if an exception is raised
  end;
end;

TStringStream internally uses the default system ANSI encoding (1 byte per char). Both the constructor and the WriteString method convert their string argument from Unicode to ANSI.

To override this behaviour you must declare the encoding explicitly in the constructor:

procedure TForm1.Button1Click(Sender: TObject);
var
  S: string;
  SS: TStringStream;
begin
  S := 'asdfg';
  SS := TStringStream.Create(S, TEncoding.Unicode);  // 2 bytes per char
  try
    SS.WriteString('321');
    Label1.Caption := SS.DataString;
  finally
    SS.Free;  // release the stream even if an exception is raised
  end;
end;
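
Applied to the original question, one way to hand LoadFromStream a two-bytes-per-character stream is to encode the string yourself and wrap the resulting bytes in a TBytesStream. The following is only a minimal sketch, assuming Delphi 2009 or later; the program and variable names are illustrative and not from the question, and since the question does not say whether the target control expects a byte-order mark, none is written here.

```pascal
program AnsiToUnicodeStream;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  AnsiValue: AnsiString;
  WideValue: string;        // UnicodeString in Delphi 2009 and later
  Utf16Bytes: TBytes;
  Stream: TBytesStream;
  I: Integer;
begin
  AnsiValue := '12345';

  // Explicit cast: ANSI (system code page) -> UTF-16 string.
  WideValue := string(AnsiValue);

  // Encode as UTF-16 little-endian, two bytes per character, no BOM.
  Utf16Bytes := TEncoding.Unicode.GetBytes(WideValue);

  Stream := TBytesStream.Create(Utf16Bytes);
  try
    // Stream.Position is 0 here, so the stream could be passed
    // straight to a LoadFromStream method (hypothetical control).
    for I := 0 to High(Utf16Bytes) do
      Write(IntToHex(Utf16Bytes[I], 2), ' ');
    Writeln;  // prints: 31 00 32 00 33 00 34 00 35 00
  finally
    Stream.Free;
  end;
end.
```

Encoding into a byte array first keeps the byte layout explicit, which makes it easy to inspect in the debugger before the control ever sees the stream.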
answered Nov 08 '22 03:11 by kludg


In recent Delphi versions you can use TEncoding:

TEncoding.UTF8.GetString(TEncoding.ANSI.GetBytes(MyString))
answered Nov 08 '22 03:11 by Rubén Pozo