
How to write a Unicode hello world in C on Windows

I'm trying to get this to work:


#define UNICODE
#define _UNICODE
#include <wchar.h>

int main()
{
    wprintf(L"Hello World!\n");
    wprintf(L"£안, 蠀, ☃!\n");
    return 0;
}

I'm using Visual Studio 2008 Express (on Windows XP, if it matters). When I run this from the command prompt (started as cmd /u, which is supposed to enable Unicode?), I get this:

C:\dev\unicodevs\unicodevs\Debug>unicodevs.exe
Hello World!
┬ú∞
C:\dev\unicodevs\unicodevs\Debug>

Which I suppose was to be expected, given that the terminal does not have the font to render those characters. But what gets me is that even if I try this:

C:\dev\unicodevs\unicodevs\Debug>cmd /u /c "unicodevs.exe > output.txt"

the file produced (even though it's UTF-8 encoded) looks like:

Hello World!
壓

The source file itself is saved as Unicode (encoded in UTF-8 without BOM). The compiler output when building:

1>------ Rebuild All started: Project: unicodevs, Configuration: Debug Win32 ------
1>Deleting intermediate and output files for project 'unicodevs', configuration 'Debug|Win32'
1>Compiling...
1>main.c
1>.\main.c(1) : warning C4005: 'UNICODE' : macro redefinition
1>        command-line arguments :  see previous definition of 'UNICODE'
1>.\main.c(2) : warning C4005: '_UNICODE' : macro redefinition
1>        command-line arguments :  see previous definition of '_UNICODE'
1>Note: including file: C:\Program Files\Microsoft Visual Studio 9.0\VC\include\wchar.h
1>Note: including file:  C:\Program Files\Microsoft Visual Studio 9.0\VC\include\crtdefs.h
1>Note: including file:   C:\Program Files\Microsoft Visual Studio 9.0\VC\include\sal.h
1>C:\Program Files\Microsoft Visual Studio 9.0\VC\include\sal.h(108) : warning C4001: nonstandard extension 'single line comment' was used
1>Note: including file:   C:\Program Files\Microsoft Visual Studio 9.0\VC\include\crtassem.h
1>Note: including file:   C:\Program Files\Microsoft Visual Studio 9.0\VC\include\vadefs.h
1>Note: including file:  C:\Program Files\Microsoft Visual Studio 9.0\VC\include\swprintf.inl
1>Note: including file:  C:\Program Files\Microsoft Visual Studio 9.0\VC\include\wtime.inl
1>Linking...
1>Embedding manifest...
1>Creating browse information file...
1>Microsoft Browse Information Maintenance Utility Version 9.00.30729
1>Copyright (C) Microsoft Corporation. All rights reserved.
1>Build log was saved at "file://c:\dev\unicodevs\unicodevs\unicodevs\Debug\BuildLog.htm"
1>unicodevs - 0 error(s), 3 warning(s)
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

Any ideas on what I am doing wrong? Similar questions on SO (like this one: unicode hello world for C?) seem to refer to *nix builds. As far as I understand, setlocale() is not available on Windows.

I also tried building this using Code::Blocks/MinGW gcc, but got the same results.

asked Mar 30 '10 06:03 by radai


1 Answer

The problem is not the writing (wprintf) but the cmd redirection of the output. You can test this by writing directly to a file instead. If you write only a couple of words, you may then find that Notepad (or rather the Windows API function it relies on) guesses the encoding incorrectly and interprets your text as ASCII. In that case, you'll need to write the BOM (byte order mark) to the file first, before the text.

#include <stdio.h>
#include <wchar.h>

int main()
{
    FILE *out;
    char bom[] = "\xFF\xFE";  /* UTF-16 little-endian byte order mark */
    wchar_t s[] = L"中文!";
    size_t c;

    /* Binary mode: text mode would turn any 0x0A byte in the raw UTF-16 data into 0x0D 0x0A. */
    out = fopen("out.txt", "wb");
    if(out == NULL)
    {
        perror("out.txt");
        return 1;
    }

    c = fwrite(bom, 1, 2, out);
    if(c != 2)
    {
        perror ("Fatal write error.");
        fclose(out);
        return 2;
    }

    /* wchar_t is 16 bits wide on Windows, so this writes the string as UTF-16LE code units. */
    c = fwrite(s, sizeof(wchar_t), wcslen(s), out);
    if(c != wcslen(s))
    {
        perror ("Fatal write error.");
        fclose(out);
        return 2;
    }

    fclose(out);

    return 0;
}
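
The program above relies on wchar_t being 16 bits wide, which holds on Windows but not on most *nix systems. As a rough sketch of the same idea (BOM first, then UTF-16LE code units, written in binary mode), the bytes can also be built explicitly from code points. The names utf16le_encode, write_utf16le_file and MAX_CODE_POINTS are hypothetical helpers made up for this illustration, and the sketch assumes the input contains only valid Unicode scalar values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_CODE_POINTS 64  /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: encode n code points as UTF-16LE, preceded by the
   FF FE byte order mark. out must hold at least 2 + 4 * n bytes.
   Assumes every value is a valid scalar (no surrogates, <= 0x10FFFF).
   Returns the number of bytes stored in out. */
size_t utf16le_encode(const unsigned long *cp, size_t n, unsigned char *out)
{
    size_t i, len = 0;

    assert(cp != NULL && out != NULL);

    out[len++] = 0xFF;  /* BOM, low byte first */
    out[len++] = 0xFE;

    for (i = 0; i < n; i++) {
        unsigned long c = cp[i];
        if (c >= 0x10000UL) {
            /* Outside the BMP: split into a surrogate pair. */
            unsigned long v = c - 0x10000UL;
            unsigned long hi = 0xD800UL + (v >> 10);
            unsigned long lo = 0xDC00UL + (v & 0x3FFUL);
            out[len++] = (unsigned char)(hi & 0xFF);
            out[len++] = (unsigned char)(hi >> 8);
            out[len++] = (unsigned char)(lo & 0xFF);
            out[len++] = (unsigned char)(lo >> 8);
        } else {
            out[len++] = (unsigned char)(c & 0xFF);
            out[len++] = (unsigned char)(c >> 8);
        }
    }
    return len;
}

/* Hypothetical helper: encode and write to path in binary mode.
   Returns 0 on success, -1 on failure or if n exceeds the sketch's limit. */
int write_utf16le_file(const char *path, const unsigned long *cp, size_t n)
{
    unsigned char buf[2 + 4 * MAX_CODE_POINTS];
    size_t len;
    FILE *out;
    int ok;

    if (n > MAX_CODE_POINTS)
        return -1;

    len = utf16le_encode(cp, n, buf);

    out = fopen(path, "wb");  /* "wb": no newline translation */
    if (out == NULL)
        return -1;

    ok = (fwrite(buf, 1, len, out) == len);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_utf16le_file("out.txt", cps, 3) with cps holding 0x4E2D, 0x6587 and 0x21 (the string 中文!) should give the same file contents as the program above produces on Windows, without depending on the width of wchar_t.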
answered Oct 23 '22 15:10 by KTC