Delphi2010: Writing code to assign Caption containing Unicode literal values or load unicode symbols from text file?

How to make a Unicode program in Delphi 2010?

I have English Windows, and "Current language for non-Unicode programs" is set to English too. Static controls look fine, but if I try to change them (Label.Caption := 'unicode value' or Memo.LoadFromFile(textFilename)), the text looks like: $^$&%*(#&#.

How to fix it?

Michael asked Jun 07 '11 19:06


1 Answer

Welcome to Stack Overflow. Please post your code when you have a problem like this. I will explain the most likely sources of a problem like the one you are seeing, but I can't help you fix it if you don't post your code. I also have to make a lot of assumptions, because your question leaves me guessing at almost everything, which is why it got closed. I hope you give more detail in the future so that we can avoid closed questions.

Let me assume a bunch of things because you haven't given me very much data to go on.

  1. You have used Delphi before, and you know about the fundamental type names like String, Char, and so on.

  2. You may not be aware of the Unicode differences between Delphi 2007 (Char=AnsiChar, String=AnsiString) and Delphi 2009 or later (including Delphi 2010 and XE), where Char=WideChar and String=UnicodeString.

  3. The most common reason you would see garbage (represented in your question as "$^$&%*(#&#") is that you have tried to directly manipulate byte-size AnsiChar data and coerced it in the wrong way into a UnicodeString.

  4. MJN also noticed, from one of your comments, that you are having trouble with source code that contains Unicode characters but was not saved as a UTF8 file. When I put Unicode characters into a source file, Delphi automatically asks me this question (shown below), which I assume you also see and answer correctly (the correct answer is yes). Your question doesn't mention this at all; you really should update it to specify the source of your problem.

[Screenshot: the prompt Delphi shows when a source file contains Unicode characters]

Here is the right-click file format menu, from which you can change the encoding at any time. The recommended value is UTF8, as shown here:

[Screenshot: the right-click file format menu with UTF8 selected]
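The question also mentions loading text into a memo from a file, and a text file raises the same encoding concern as a source file. As a minimal sketch, assuming the file really is UTF8 and assuming a VCL form with a TMemo named Memo1 (the method name and file name here are made up for illustration), the encoding can be passed explicitly instead of letting the loader guess:

```delphi
// Sketch only: Memo1, LoadUtf8Text and 'unicode.txt' are hypothetical names.
// TEncoding is declared in SysUtils, which must be in the unit's uses clause.
procedure TForm1.LoadUtf8Text;
begin
  // Pass the encoding explicitly so the bytes are decoded as UTF8
  // rather than with the system's default ANSI code page.
  Memo1.Lines.LoadFromFile('unicode.txt', TEncoding.UTF8);
end;
```

If the file is in some other encoding, a different TEncoding value would be needed; this is a guess at the cause, not a confirmed diagnosis.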

You should definitely post the affected code that generates the incorrect string values. Do not start with a giant application that you are trying to port to Unicode Delphi (which is a further assumption I'm making here, and the largest one). Start instead with some small sample code.

Here's an example of "badly written code" that happens to still work in Delphi 7, because each character there is one byte in size. That assumption does not carry over to Delphi 2009 and later, including XE:

procedure TForm1.TestBad;
var
  x: PAnsiChar;
  s: string;
begin
  x := 'test';
  // BAD: the cast makes the 1-byte AnsiChar data be read as 2-byte Chars
  s := Copy(PChar(x), 1, 10);
  Self.Caption := s;
end;

Here's the same contrived sample code "fixed" (more like not intentionally broken) so that it will at least work in Delphi XE:

procedure TForm1.TestLessBad;
var
  x: PAnsiChar;
  s: string;
begin
  x := 'test';
  // No cast: the compiler performs the Ansi-to-Unicode conversion implicitly
  s := Copy(x, 1, 10);
  Self.Caption := s;
end;

The use of pointers above is contrived, and unnecessary, except that I am trying to teach with this example.

The first example will show Chinese characters in the caption of the form instead of the text 'test', because each pair of bytes is read as a single 2-byte character. I have intentionally done something BAD to show you one easy way to generate the noise you describe by making a mistake in code.
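To make the "2 bytes become a single character" effect concrete, here is a small console sketch for Delphi 2009 or later. It is not the cast from the first example: it uses Move to copy the raw bytes into a UnicodeString, which reproduces the same reinterpretation without reading past the end of the data. The program and variable names are made up for illustration.

```delphi
program ByteReinterpretDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Ansi: AnsiString;
  Wide: UnicodeString;
begin
  Ansi := 'test'; // four 1-byte characters: $74 $65 $73 $74

  // Copy the raw bytes, unconverted, into two 2-byte UTF-16 code units.
  SetLength(Wide, Length(Ansi) div SizeOf(WideChar));
  Move(Ansi[1], Wide[1], Length(Wide) * SizeOf(WideChar));

  // On little-endian x86 the first byte of each pair is the low byte.
  Writeln(IntToHex(Ord(Wide[1]), 4)); // 6574 ('t','e' read as one code unit)
  Writeln(IntToHex(Ord(Wide[2]), 4)); // 7473 ('s','t' read as one code unit)
end.
```

Both values fall in the CJK range of Unicode, which is why the caption shows Chinese-looking characters rather than 'test'.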

If you are having trouble with particular unicode code-points, let me suggest you try this notation:

c := Char($21CC);  // this is U+21CC (cool two arrows thingy used in chemistry to indicate a reversible reaction)

Alternatively you will see this, which is almost the same thing:

c := #$21CC; // U+21CC

Notice how you don't need a UTF8 encoded file to store things you write this way.
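As a hedged sketch of how that notation applies to the Caption assignment in the question, assuming a VCL form with a TLabel named Label1 (the method name is made up for illustration):

```delphi
// Sketch only: Label1 and ShowReversibleReaction are hypothetical names.
procedure TForm1.ShowReversibleReaction;
begin
  // #$21CC is U+21CC written as a character code, so the source file
  // itself contains only ASCII and its encoding does not matter.
  Label1.Caption := 'A + B ' + #$21CC + ' C';
end;
```

Whether the symbol actually renders also depends on the label's font containing that glyph.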

Warren P answered Sep 18 '22 06:09