
g++ unicode variable name

I am trying to use unicode variable names in g++.

It does not appear to work.

Does g++ not support unicode variable names, or is there some subset of unicode that I am not testing?

Thanks!

anon asked Apr 21 '10 09:04



2 Answers

You have to specify the -fextended-identifiers flag when compiling, and you also have to write the characters as \uXXXX or \UXXXXXXXX, which denote unicode code points (at least in gcc).

Identifiers (variable names, class names, etc.) in g++ cannot be written in raw UTF-8, UTF-16 or any other encoding. They have to match this grammar:

identifier:
  nondigit
  identifier nondigit
  identifier digit

A nondigit is:

nondigit: one of
  universal-character-name
  _ a b c d e f g h i j k l m n o p q r s t u v w x y z
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

and a universal-character-name is:

universal-character-name:
  \UXXXXXXXX
  \uXXXX
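As a rough illustration of the grammar quoted above, here is a minimal sketch of a checker for identifier spellings. The names ucn_length and is_identifier_spelling are hypothetical helpers invented for this example, not part of gcc. The sketch only checks the shape of \uXXXX and \UXXXXXXXX and does not check whether the code point is one the standard actually allows in identifiers.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: length of a universal-character-name (\uXXXX or
// \UXXXXXXXX) starting at s[i], or 0 if none starts there.
static std::size_t ucn_length(const std::string& s, std::size_t i) {
    if (i + 1 >= s.size() || s[i] != '\\') return 0;
    std::size_t digits = 0;
    if (s[i + 1] == 'u') digits = 4;
    else if (s[i + 1] == 'U') digits = 8;
    else return 0;
    if (i + 2 + digits > s.size()) return 0;
    for (std::size_t k = 0; k < digits; ++k) {
        if (!std::isxdigit(static_cast<unsigned char>(s[i + 2 + k]))) return 0;
    }
    return 2 + digits;
}

// Hypothetical checker: true if s is a nondigit followed by any mix of
// nondigits and digits, where a nondigit is _, a-z, A-Z or a
// universal-character-name. Raw non-ASCII bytes are rejected.
bool is_identifier_spelling(const std::string& s) {
    if (s.empty()) return false;
    std::size_t i = 0;
    bool first = true;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::size_t n = ucn_length(s, i)) {
            i += n;                       // \uXXXX or \UXXXXXXXX
        } else if (c == '_' || (c < 0x80 && std::isalpha(c))) {
            ++i;                          // _ a-z A-Z
        } else if (!first && std::isdigit(c)) {
            ++i;                          // digits, but not in first position
        } else {
            return false;                 // e.g. a raw UTF-8 byte
        }
        first = false;
    }
    return true;
}
```

With this sketch, the spelling h\u00F8yde is accepted, while the same name typed as raw UTF-8 bytes is rejected.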

Thus, even if you save your source file as UTF-8, you cannot declare a variable such as:

int høyde = 10;

It has to be written as:

int h\u00F8yde = 10;

(which IMO defeats the whole purpose, so just stick with a-z)
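For reference, a minimal sketch of a translation unit that uses the escaped spelling. The file name example.cpp and the accessor read_height are made up for illustration, and the build command in the comment is an assumption based on the flag named above.

```cpp
// Assumed build command: g++ -fextended-identifiers -c example.cpp
// (newer GCC releases may enable extended identifiers by default).

// The identifier is spelled with a universal-character-name;
// U+00F8 is the second letter of the name from the example above.
int h\u00F8yde = 10;

// Hypothetical accessor, present only so the value can be read back.
int read_height() {
    return h\u00F8yde;
}
```

Every use of the variable has to repeat the escape, which is the readability cost mentioned above.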

nos answered Dec 14 '22 20:12



A one-line patch to the cpp preprocessor allows UTF-8 input. Details for gcc are given at

https://www.raspberrypi.org/forums/viewtopic.php?p=802657

Since the preprocessor is shared, the same patch should work for g++ as well. The patch needed, as of gcc-5.2, is:

diff -cNr gcc-5.2.0/libcpp/charset.c gcc-5.2.0-ejo/libcpp/charset.c
*** gcc-5.2.0/libcpp/charset.c  Mon Jan  5 04:33:28 2015
--- gcc-5.2.0-ejo/libcpp/charset.c  Wed Aug 12 14:34:23 2015
***************
*** 1711,1717 ****
    struct _cpp_strbuf to;
    unsigned char *buffer;

!   input_cset = init_iconv_desc (pfile, SOURCE_CHARSET, input_charset);
    if (input_cset.func == convert_no_conversion)
      {
        to.text = input;
--- 1711,1717 ----
    struct _cpp_strbuf to;
    unsigned char *buffer;

!   input_cset = init_iconv_desc (pfile, "C99", input_charset);
    if (input_cset.func == convert_no_conversion)
      {
        to.text = input;

Note that for the above patch to work, a recent version of iconv that supports C99 conversions needs to be installed. Type iconv --list to verify this. Otherwise, you can install a new version of iconv along with gcc as described in the link above. Change the configure command to

$ ../gcc-5.2.0/configure -v --disable-multilib \
    --with-libiconv-prefix=/usr/local/gcc-5.2 \
    --prefix=/usr/local/gcc-5.2 \
    --enable-languages="c,c++"

if you are building for x86 and want to include the C++ compiler as well.
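To make the effect of the patch more concrete: with the input converted to iconv's C99 encoding, non-ASCII characters in the source reach the compiler as universal-character-names. The following is a hand-written sketch of that kind of conversion, not the code iconv uses. The function name utf8_to_ucn is hypothetical, the exact output formatting (uppercase hex here) is an assumption, and the sketch does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: rewrite every non-ASCII UTF-8 sequence in `in`
// as \uXXXX (code points up to U+FFFF) or \UXXXXXXXX (above that).
std::string utf8_to_ucn(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        if (b < 0x80) {                  // plain ASCII passes through
            out += in[i++];
            continue;
        }
        std::size_t extra = 0;           // number of continuation bytes
        std::uint32_t cp = 0;
        if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        char buf[11];                    // "\UXXXXXXXX" plus terminator
        if (cp <= 0xFFFF)
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
        else
            std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(cp));
        out += buf;
    }
    return out;
}
```

Applied to the declaration from the other answer, typed as UTF-8, this yields the h\u00F8yde spelling that the unpatched compiler already accepts.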

ejolson answered Dec 14 '22 21:12
