As the experts have kindly suggested, TStringStream.DataString cannot be used to retrieve non-text data loaded by TStringStream.LoadFromFile, because TStringStream.GetDataString calls TEncoding's encoding methods. Taking TMBCSEncoding as an example, that means calling TMBCSEncoding.GetChars, which in turn calls TMBCSEncoding.UnicodeFromLocaleChars and finally Windows's MultiByteToWideChar.
TBytes is recommended as the data buffer/binary storage. (For this purpose, TBytes is recommended over AnsiString.) The bytes can be retrieved with the TStringStream.ReadBuffer method or the TStringStream.Bytes property. Either way, TStream.Size should be taken into account.
====================================================
I am trying to use TStringStream and its DataString for Base64 encoding/decoding purposes. It seems possible, as indicated by Nils Haeck's reply here or here.
Using TStringStream.DataString in TMainForm.QuestionOfString_StringStream (No. 2 to No. 7) fails, in that the information is corrupted (i.e., not the same as the original information). However, ss_loaded_2.SaveToFile (No. 1) saves the original information, which suggests that TStringStream does hold the decoded non-textual data correctly internally. Could you comment on the possible reasons for the DataString corruption?
In his kind answer, Rob Kennedy mentioned that string or AnsiString should be avoided for storing Base64-decoded non-textual data, which makes great sense. However, as shown in TMainForm.QuestionOfString_NativeXML, the DecString of type AnsiString holds the decoded bytes so correctly that the data can be encoded back. Does this mean AnsiString can hold decoded non-textual data intact?
David Heffernan and Rob Kennedy have kindly commented about bytes/TBytes. However, the bytes extracted in TMainForm.QuestionOfString_NativeXML_Bytes_1 are different from TStringStream's Bytes in TMainForm.QuestionOfString_NativeXML_Bytes_2. (Judging from the Base64 encoding/decoding results, TStringStream.Bytes is wrong. This is confusing because, based on the paragraph above, TStringStream should contain the intact bytes internally.) Could you comment on the possible reason?
Thank you very much for your help!
PS: The sample files can be downloaded from SkyDrive: REF_EncodedSample & REF_DecodedSample (a Zlib-compressed image file).
PS: Delphi XE, Windows 7. (It seems TStringStream back in Delphi 7 doesn't have LoadFromFile or SaveToFile.)
unit uMainForm;

interface

uses
  CodeSiteLogging,
  NativeXml, // v3.10
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs;

type
  TMainForm = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
    procedure QuestionOfString_StringStream;
    procedure QuestionOfString_NativeXML;
    procedure QuestionOfString_NativeXML_Bytes_1;
    procedure QuestionOfString_NativeXML_Bytes_2;
  public
    { Public declarations }
  end;

var
  MainForm: TMainForm;

implementation

{$R *.dfm}

// http://stackoverflow.com/questions/773297/how-can-i-convert-tbytes-to-rawbytestring
function Convert(const Bytes: TBytes): RawByteString;
begin
  SetLength(Result, Length(Bytes));
  if Length(Bytes) > 0 then
  begin
    Move(Bytes[0], Result[1], Length(Bytes));
    // SetCodePage(Result, CP_ACP, False);
  end;
end;

procedure TMainForm.FormCreate(Sender: TObject);
begin
  QuestionOfString_StringStream;
  QuestionOfString_NativeXML;
  QuestionOfString_NativeXML_Bytes_1;
  QuestionOfString_NativeXML_Bytes_2;
end;

// http://www.delphigroups.info/2/3/321962.html
// http://borland.newsgroups.archived.at/public.delphi.graphics/200712/0712125679.html
procedure TMainForm.QuestionOfString_StringStream;
var
  ss_loaded_2, ss_loaded_3: TStringStream;
  dataStr: AnsiString;
  hexOfDataStr: AnsiString;
begin
  ss_loaded_2 := TStringStream.Create();
  // load the file containing Base64-decoded sample data
  ss_loaded_2.LoadFromFile('REF_DecodedSample');

  // 1
  ss_loaded_2.SaveToFile('REF_DecodedSample_1_SavedByStringStream');

  // 2
  ss_loaded_3 := TStringStream.Create(ss_loaded_2.DataString);
  ss_loaded_3.SaveToFile('REF_DecodedSample_2_SavedByStringStream');

  // 3
  ss_loaded_3.Free;
  ss_loaded_3 := TStringStream.Create(ss_loaded_2.DataString, TEncoding.ASCII);
  ss_loaded_3.SaveToFile('REF_DecodedSample_3_SavedByStringStream');

  // 4
  ss_loaded_3.Free;
  ss_loaded_3 := TStringStream.Create(ss_loaded_2.DataString, TEncoding.UTF8);
  ss_loaded_3.SaveToFile('REF_DecodedSample_4_SavedByStringStream');

  // 5
  ss_loaded_3.Free;
  ss_loaded_3 := TStringStream.Create(AnsiString(ss_loaded_2.DataString));
  ss_loaded_3.SaveToFile('REF_DecodedSample_5_SavedByStringStream');

  // 6
  ss_loaded_3.Free;
  ss_loaded_3 := TStringStream.Create(UTF8String(ss_loaded_2.DataString));
  ss_loaded_3.SaveToFile('REF_DecodedSample_6_SavedByStringStream');

  // 7
  dataStr := ss_loaded_2.DataString;
  SetLength(hexOfDataStr, 2 * Length(dataStr));
  BinToHex(@dataStr[1], PAnsiChar(@hexOfDataStr[1]), Length(dataStr));
  CodeSite.Send(hexOfDataStr);

  ss_loaded_2.Free;
  ss_loaded_3.Free;
end;

// http://www.simdesign.nl/forum/viewtopic.php?f=2&t=1311
procedure TMainForm.QuestionOfString_NativeXML;
var
  LEnc, LDec: integer;
  EncStream: TMemoryStream;
  DecStream: TMemoryStream;
  EncString: AnsiString;
  DecString: AnsiString;
begin
  // encode and decode streams
  EncStream := TMemoryStream.Create;
  DecStream := TMemoryStream.Create;
  try
    // load BASE64-encoded data
    EncStream.LoadFromFile('REF_EncodedSample');
    LEnc := EncStream.Size;
    SetLength(EncString, LEnc);
    EncStream.Read(EncString[1], LEnc);

    // decode BASE64-encoded data, after removing control chars
    DecString := DecodeBase64(sdRemoveControlChars(EncString));
    LDec := length(DecString);
    DecStream.Write(DecString[1], LDec);

    // save the decoded data
    DecStream.SaveToFile('REF_DecodedSample_7_SavedByNativeXml');

    // EncString := sdAddControlChars(EncodeBase64(DecString), #$0D#$0A);
    EncString := EncodeBase64(DecString);

    // clear and resave encode stream as a copy
    EncStream.Clear;
    EncStream.Write(EncString[1], Length(EncString));
    EncStream.SaveToFile('REF_EncodedSampleCopy');
  finally
    EncStream.Free;
    DecStream.Free;
  end;
end;

procedure TMainForm.QuestionOfString_NativeXML_Bytes_1;
var
  LEnc, LDec: integer;
  EncStream: TMemoryStream;
  DecStream: TMemoryStream;
  EncString: AnsiString;
  DecString: AnsiString;
  DecBytes: TBytes;
begin
  // encode and decode streams
  EncStream := TMemoryStream.Create;
  DecStream := TMemoryStream.Create;
  try
    // load BASE64-decoded data
    DecStream.LoadFromFile('REF_DecodedSample');
    LDec := DecStream.Size;
    SetLength(DecBytes, LDec);
    DecStream.Read(DecBytes[0], LDec);

    EncString := EncodeBase64(Convert(DecBytes));

    // clear and resave encode stream as a copy
    EncStream.Write(EncString[1], Length(EncString));
    EncStream.SaveToFile('REF_EncodedSampleCopy_Bytes_1');
  finally
    EncStream.Free;
    DecStream.Free;
  end;
end;

procedure TMainForm.QuestionOfString_NativeXML_Bytes_2;
var
  LEnc, LDec: integer;
  EncStream: TMemoryStream;
  DecStream: TStringStream;
  EncString: AnsiString;
  DecString: AnsiString;
  DecBytes: TBytes;
begin
  // encode and decode streams
  EncStream := TMemoryStream.Create;
  DecStream := TStringStream.Create;
  try
    // load BASE64-decoded data
    DecStream.LoadFromFile('REF_DecodedSample');
    DecBytes := DecStream.Bytes;

    EncString := EncodeBase64(Convert(DecBytes));

    // clear and resave encode stream as a copy
    EncStream.Write(EncString[1], Length(EncString));
    EncStream.SaveToFile('REF_EncodedSampleCopy_Bytes_2');
  finally
    EncStream.Free;
    DecStream.Free;
  end;
end;

end.

It's really no surprise that examples 3 through 7 fail. Your file is not textual data, so storing it in a text data structure is bound to show problems. Each of those tests involves converting the data from one encoding to another. Since your data isn't encoded as UTF-16 text to begin with, any conversion that expects the data to have that encoding is going to fail.
Example 2 probably fails because you have an odd number of bytes, and you're storing it in a string that by definition contains an even number of bytes. Somewhere, a byte is going to be introduced or dropped, causing different data to be stored.
Unless you're dealing with text, don't use TStringStream, string, or AnsiString. Try TBytesStream or TMemoryStream instead.
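As a rough illustration of that advice, here is a minimal sketch that reads a binary file into a TBytes buffer through a TMemoryStream, with no string type involved. The function name LoadFileBytes and the use of ParamStr(0) as a convenient test file are assumptions made for this example only; they are not part of the original code.

```delphi
program LoadBinaryFile;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: returns the raw bytes of a file, untouched by any
// text encoding.
function LoadFileBytes(const FileName: string): TBytes;
var
  Stream: TMemoryStream;
  Count: Integer;
begin
  Stream := TMemoryStream.Create;
  try
    Stream.LoadFromFile(FileName);
    // Size is the number of bytes actually loaded.
    Count := Integer(Stream.Size);
    SetLength(Result, Count);
    if Count > 0 then
    begin
      Stream.Position := 0;
      Stream.ReadBuffer(Result[0], Count);
    end;
  finally
    Stream.Free;
  end;
end;

var
  Data: TBytes;
begin
  // The running executable is used here only because it is a binary file
  // that is guaranteed to exist.
  Data := LoadFileBytes(ParamStr(0));
  Writeln(Length(Data), ' bytes read');
end.
```

The buffer is sized from Stream.Size and filled with ReadBuffer, so it contains exactly the bytes of the file and nothing else.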
Feel free to store Base64-encoded data in a string. Base64 is a text format. But once you decode it, it's binary again, and has no business being in a text data structure anymore.
The reason you see different results now from what Nils Haeck suggested you should expect is that Haeck was writing in 2007, before Delphi strings became Unicode and the RTL did any automatic code-page conversions. You're using Delphi XE, where string is UnicodeString.
You are not taking into account that TStringStream derives from TBytesStream (which in turn derives from TMemoryStream) in D2009+, but derived directly from TStream in earlier versions. TMemoryStream allocates memory differently than your code is expecting, and the TBytesStream.Bytes property represents the entire memory block that TMemoryStream allocates, but that does not mean that the entire contents of that memory is filled in with actual data. There is some extra padding involved that your code needs to ignore; only the first Size bytes are valid.
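One way to ignore that padding is to copy only the first Size bytes out of Bytes. The following is a minimal sketch under that assumption; the helper name ContentBytes is hypothetical, and ParamStr(0) is used only as a binary file that is sure to exist.

```delphi
program TrimStreamBytes;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: copies only the bytes that hold real data,
// leaving out the unused tail of the allocated block.
function ContentBytes(Stream: TBytesStream): TBytes;
begin
  Result := Copy(Stream.Bytes, 0, Integer(Stream.Size));
end;

var
  Stream: TStringStream;
  Data: TBytes;
begin
  Stream := TStringStream.Create;
  try
    Stream.LoadFromFile(ParamStr(0));
    Data := ContentBytes(Stream);
    // Length(Stream.Bytes) is the allocated capacity and may exceed Size;
    // Length(Data) matches Size.
    Writeln('Size:          ', Stream.Size);
    Writeln('Length(Bytes): ', Length(Stream.Bytes));
    Writeln('Length(Data):  ', Length(Data));
  finally
    Stream.Free;
  end;
end.
```

In QuestionOfString_NativeXML_Bytes_2, the same idea would mean passing the trimmed copy to Convert instead of DecStream.Bytes itself.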
See my answer to your other question for a more detailed explanation as to why your code is failing.