I decided to convert my Win32 C++ application to Unicode, but when I do, I get unreadable characters for Arabic, Chinese, and Japanese text.
First, without Unicode, Arabic displays correctly in edit boxes and window titles:
HWND hWnd = CreateWindowEx(WS_EX_CLIENTEDGE, "Edit", "ا ب ت ث ج ح خ د ذ", WS_CHILD | WS_VISIBLE | WS_BORDER | ES_MULTILINE, 10, 10, 300, 200, hWnd, (HMENU)100, GetModuleHandle(NULL), NULL);
SetWindowText(hWnd, "صباح الخير");
The output looks correct and works fine without Unicode.
Then I added this before the header includes:
#define UNICODE
#include <windows.h>
Now in Window Procedure:
case WM_CREATE:{
HWND hEdit = CreateWindowExW(WS_EX_CLIENTEDGE, L"Edit", L"ا ب ت ث ج ح خ د ذ", WS_CHILD | WS_VISIBLE | WS_BORDER | ES_MULTILINE, 10, 10, 300, 200, hWnd, (HMENU)100, GetModuleHandle(NULL), NULL);
// Even if I send a message to change the text, I get unreadable characters!
}
break;
case WM_LBUTTONDBLCLK:{
SendDlgItemMessageW(hWnd, 100, WM_SETTEXT, 0, (LPARAM)L"السلام عليكم"); // This also shows unreadable characters
}
break;
As you can see, with Unicode the controls cannot display Arabic characters correctly.
If I type Arabic text into the control manually, it displays correctly. So why does it fail when I set the text through functions such as SetWindowTextW()?
Please help. Thank you.
Make sure to save the source file as UTF-16 or as UTF-8 with a BOM. Otherwise, many Windows applications assume the ANSI encoding (the default localized Windows code page). You can also use a compiler switch to force UTF-8 for source files. For example, the Microsoft Visual Studio 2015 compiler has a /utf-8 switch, so saving with a BOM is not required.
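If changing the file encoding or the compiler switches is not an option, one possible workaround is to write the non-ASCII characters as universal character names, so the source file contains only ASCII bytes and the compiler's guess about the encoding no longer matters. This is a minimal sketch of that idea, not something from the original question; the helper name greeting is hypothetical, and the code points spell the "good morning" string used in the question's SetWindowText call.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not part of the
// original post). The literal uses \u escapes only, so the source file is
// pure ASCII and decodes the same way under any ANSI code page or UTF-8.
std::wstring greeting()
{
    // U+0635 U+0628 U+0627 U+062D, a space, U+0627 U+0644 U+062E U+064A U+0631
    return L"\u0635\u0628\u0627\u062D \u0627\u0644\u062E\u064A\u0631";
}
```

On Windows the result could then be passed along as SetWindowTextW(hEdit, greeting().c_str()). The trade-off is readability: escaped literals are hard to proofread, so fixing the file encoding is usually the nicer solution.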
Here's a simple example, saved first as UTF-8 and then as UTF-8 with a BOM, and compiled each time with the Microsoft Visual Studio compiler. Note that there is no need to define UNICODE if you hard-code the W versions of the APIs and use L"" for wide strings:
#include <windows.h>
int main()
{
MessageBoxW(NULL,L"ا ب ت ث ج ح خ د ذ",L"中文",MB_OK);
}
Result (UTF-8). The compiler assumed ANSI encoding (Windows-1252) and decoded the wide string incorrectly.
Result (UTF-8 w/ BOM). The compiler detects the BOM and uses UTF-8 to decode the source code, resulting in the correct data generated for the wide strings.
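To check which of the two cases applies to your own build, it may help to look at the code units the compiler actually stored in a wide literal. The following is a rough diagnostic sketch; the function name dump_code_units is hypothetical. A correctly decoded Arabic alef should show up as the single unit 0627, while a UTF-8 file misread as Windows-1252 would produce the two units 00D8 00A7 for the same character.

```cpp
#include <cstdio>
#include <string>

// Hypothetical diagnostic helper (an assumption for illustration):
// returns the code units of a wide string as space-separated hex values.
std::string dump_code_units(const std::wstring& text)
{
    std::string out;
    char buf[16];
    for (wchar_t unit : text) {
        // Format each unit as at least four uppercase hex digits.
        std::snprintf(buf, sizeof buf, "%04lX", static_cast<unsigned long>(unit));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For example, passing the literal from the message box call above and printing the result (or sending it to OutputDebugStringA) should show values in the 06xx range for the Arabic letters if the source file was decoded correctly.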
A little Python code demonstrating the decode error:
>>> s='中文,ا ب ت ث ج ح خ د ذ'
>>> print(s.encode('utf8').decode('Windows-1252'))
ä¸­æ–‡ï¼ŒØ§ Ø¨ Øª Ø« Ø¬ Ø­ Ø® Ø¯ Ø°