
Is there a reason for any C or C++ compiler to not define wctrans_t and wctype_t as the type wchar_t?

Tags: c++, c, widechar

I'm working on a comparison of data types across programming languages, and I ran into a problem while reading the C and C++ standards.

Quoted from C11,

wctrans_t is a scalar type that can hold values which represent locale-specific character mappings

wctype_t is a scalar type that can hold values which represent locale-specific character classifications

The phrase "a scalar type" indicates that C11 does not restrict wctrans_t and wctype_t to any specific scalar type.

My MinGW GCC 4.8 implements wctrans_t and wctype_t as typedefs for wchar_t, and I can't think of a reason for any other C compiler to define them differently.

Could somebody prove otherwise, or show how that could happen?

Astaroth asked Aug 18 '14 14:08


2 Answers

Cubbi has already answered this question. Here are a couple of additional points, because the definitions in the standard are not really self-explanatory.

A wctype_t represents locale-specific character classifications. So it's not about characters, but about their classification (i.e. the old isalpha(), isalnum(), ...). The wctype_t values are used by the function iswctype() to test a wide character. Example (C11, section 7.30.2.2.1):

iswctype(wc, wctype("alnum")) // iswalnum(wc)
iswctype(wc, wctype("alpha")) // iswalpha(wc)
iswctype(wc, wctype("blank")) // iswblank(wc)
iswctype(wc, wctype("lower")) // iswlower(wc)
...

Similarly, a wctrans_t represents locale-specific character mappings. So it's not about a character code set, but about mappings from one wide character to a related one (e.g. like the old toupper(), tolower(), ...). The mappings are described in section 7.30.3 of the C11 standard; here are some examples:

towctrans(wc, wctrans("tolower")) // towlower(wc)
towctrans(wc, wctrans("toupper")) // towupper(wc)

The wchar_t definition that you mention seems misleading to me, although a wchar_t is an integer type too.

Here is how they are defined in MSVC13:

typedef unsigned short wint_t;
typedef unsigned short wctype_t;
typedef wchar_t wctrans_t;     // yes, here too ! 

Christophe answered Sep 17 '22 23:09


I am surprised someone defined them as wchar_t; neither wctype_t nor wctrans_t has anything to do with characters.

Both platforms I use define them as something else:

aix~$ grep wctype_t /usr/include/*h | grep typedef 
/usr/include/ctype.h:   typedef unsigned int    wctype_t;

aix~$ grep wctrans_t /usr/include/*h | grep typedef 
/usr/include/wctype.h:typedef wint_t (*wctrans_t)();


solaris~$ grep wctype_t /usr/include/*h | grep typedef 
/usr/include/wchar.h:typedef    int     wctype_t;

solaris~$ grep wctrans_t /usr/include/*/*h | grep typedef
/usr/include/iso/wctype_iso.h:typedef unsigned int      wctrans_t;

Cubbi answered Sep 20 '22 23:09