
Delphi 7 and decode UTF-8 base64

In Delphi 7, I have a Base64-encoded WideString that I received as the result of a web service call:

PD94bWwgdmVyc2lvbj0iMS4wIj8+DQo8c3RyaW5nPtiq2LPYqjwvc3RyaW5nPg==

When I decode it, the result is garbled rather than being read as UTF-8:

<?xml version="1.0"?>
<string>طھط³طھ</string>

But when I decode it with base64decode.org, the result is correct:

<?xml version="1.0"?>
<string>تست</string>

I used the DecodeString function from the EncdDecd unit.

Mohamad asked Apr 16 '26 15:04


2 Answers

The problem is that you are using DecodeString. In Delphi 7, that function returns the decoded bytes in an AnsiString, so they are treated as ANSI-encoded text. Your text, however, is UTF-8 encoded.
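One way to confirm that the Base64 decoding itself is fine, and that only the interpretation of the bytes goes wrong, is to print the non-ASCII bytes in hex. This is a minimal sketch using only DecodeString from the question and IntToHex from SysUtils. The remark about Windows-1256 is an assumption about the asker's ANSI code page, inferred from the garbled output.

```delphi
{$APPTYPE CONSOLE}

uses
  SysUtils,
  EncdDecd;

const
  Data = 'PD94bWwgdmVyc2lvbj0iMS4wIj8+DQo8c3RyaW5nPtiq2LPYqjwvc3RyaW5nPg==';

var
  Raw: AnsiString;
  I: Integer;

begin
  // DecodeString returns the raw decoded bytes in an AnsiString.
  Raw := DecodeString(Data);
  // Print only the bytes outside the ASCII range.
  for I := 1 to Length(Raw) do
    if Ord(Raw[I]) >= $80 then
      Write(IntToHex(Ord(Raw[I]), 2), ' ');
  Writeln;
  // Prints: D8 AA D8 B3 D8 AA
end.
```

Those six bytes are the UTF-8 encoding of the three characters تست. Read one byte at a time in an Arabic ANSI code page such as Windows-1256, they display as طھط³طھ, which matches the garbled output in the question.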

To continue with the EncdDecd unit you have a couple of options. You can switch to DecodeStream. For instance, this code will produce a UTF-8 encoded text file with your data:

{$APPTYPE CONSOLE}

uses
  Classes,
  EncdDecd;

const
  Data = 'PD94bWwgdmVyc2lvbj0iMS4wIj8+DQo8c3RyaW5nPtiq2LPYqjwvc3RyaW5nPg==';

var
  Input: TStringStream;
  Output: TFileStream;

begin
  Input := TStringStream.Create(Data);
  try
    Output := TFileStream.Create('C:\desktop\out.txt', fmCreate);
    try
      DecodeStream(Input, Output);
    finally
      Output.Free;
    end;
  finally
    Input.Free;
  end;
end.

Or you could continue with DecodeString, but then immediately decode the UTF-8 text to a WideString. Like this:

{$APPTYPE CONSOLE}

uses
  Classes,
  EncdDecd;

const
  Data = 'PD94bWwgdmVyc2lvbj0iMS4wIj8+DQo8c3RyaW5nPtiq2LPYqjwvc3RyaW5nPg==';

var
  Utf8: AnsiString;
  wstr: WideString;

begin
  Utf8 := DecodeString(Data);
  wstr := UTF8Decode(Utf8);
end.

If the text can be represented in your application's prevailing ANSI code page, then you can convert that WideString to a plain AnsiString:

var
  wstr: WideString;
  str: string; // alias to AnsiString
....
wstr := ... // as before
str := wstr;
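
Whether a particular WideString survives that conversion can be checked with a round trip. The helper below is a sketch with a hypothetical name (FitsInAnsi). It relies only on the implicit WideString/AnsiString conversions, which use the system ANSI code page.

```delphi
// Hypothetical helper: returns True if W can be converted to the
// system ANSI code page and back without changing any character.
function FitsInAnsi(const W: WideString): Boolean;
var
  A: AnsiString;
  Back: WideString;
begin
  A := W;      // WideString -> ANSI; unmappable characters are replaced
  Back := A;   // ANSI -> WideString again
  Result := Back = W;
end;
```

If it returns False, assigning the WideString to a string would silently replace characters, so the text should stay in a WideString.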

However, I really don't think that using ANSI encoded text is going to lead to a very fruitful programming life. I encourage you to embrace Unicode solutions.

Judging by its content, the decoded data is XML, which is usually handed to an XML parser. Most XML parsers accept UTF-8 encoded data, so you can quite probably base64 decode to a memory stream using DecodeStream and then hand that stream off to your XML parser. That way you don't need to decode the UTF-8 to text yourself and can let the XML parser deal with that aspect.
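
A minimal sketch of that approach, assuming only the Classes and EncdDecd units already used above. The function name Base64ToMemoryStream is hypothetical, and the parser call is deliberately left out because it depends on which XML library you use.

```delphi
uses
  Classes,
  EncdDecd;

// Hypothetical helper: decodes Base64 text into a new memory stream.
// The caller owns the returned stream and must free it.
function Base64ToMemoryStream(const Data: AnsiString): TMemoryStream;
var
  Input: TStringStream;
begin
  Result := TMemoryStream.Create;
  try
    Input := TStringStream.Create(Data);
    try
      // The decoded bytes are written untouched, so the UTF-8 stays intact.
      DecodeStream(Input, Result);
    finally
      Input.Free;
    end;
    // Rewind so that a parser reads from the start of the data.
    Result.Position := 0;
  except
    Result.Free;
    raise;
  end;
end;
```

The returned stream can then be passed to whichever load-from-stream method your XML parser offers, and freed once parsing is done.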

David Heffernan answered Apr 18 '26 15:04


As an addendum to David Heffernan's answer, and to Remy Lebeau's note that UTF8Decode() is broken in Delphi 7, I would like to add a function that will help any developer stuck on Delphi 7.

Since UTF8Decode() is broken in Delphi 7, I found a function in a forum that solved my problem:

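// Requires the Windows unit for MultiByteToWideChar and CP_UTF8.
function UTF8ToWideString(const S: AnsiString): WideString;
var
  BufSize: Integer;
begin
  Result := '';
  if Length(S) = 0 then Exit;
  // First call: ask how many wide characters the converted text needs.
  BufSize := MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(S), Length(S), nil, 0);
  SetLength(Result, BufSize);
  // Second call: convert into the buffer that was just allocated.
  MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(S), Length(S), PWideChar(Result), BufSize);
end;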

So now, you can use DecodeString, and then decode the UTF-8 text to a WideString using this function:

begin
  Utf8 := DecodeString(Data);
  wstr := UTF8ToWideString(Utf8);
end.

Tarrakis answered Apr 18 '26 16:04