
Where does glibc get its database of unicode attributes? [closed]

Where does glibc get its database of Unicode attributes for functions such as wcwidth()? I'm interested in correcting a few errant entries, but I can't seem to find where this information lives in the source distribution.

If it matters, I'm primarily interested in this under debian or ubuntu linux.

asked May 05 '09 01:05 by bdonlan


2 Answers

It looks like the data is generated by the (apparently manually-run) localedata/gen-unicode-ctype.c from the Unicode data files published at http://unicode.org/Public/UNIDATA/ . Thanks to Naaff for pointing me in the right direction!
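To get a feel for what a generator like this has to read, here is a rough Python sketch that parses one record in the semicolon-separated UnicodeData.txt format from the UNIDATA directory. This is not glibc's code. It assumes the generator consumes UnicodeData.txt (I have not confirmed which files it actually reads), and the names UnicodeRecord and parse_unicodedata_line are made up for illustration.

```python
from typing import NamedTuple


class UnicodeRecord(NamedTuple):
    """The first three fields of a UnicodeData.txt record (hypothetical type)."""
    codepoint: int
    name: str
    category: str


def parse_unicodedata_line(line: str) -> UnicodeRecord:
    """Parse one UnicodeData.txt line (hypothetical helper, not part of glibc).

    Fields are separated by ';'. Field 0 is the code point in hex,
    field 1 is the character name, field 2 is the general category.
    """
    fields = line.rstrip("\n").split(";")
    if len(fields) < 3:
        raise ValueError(f"not a UnicodeData.txt record: {line!r}")
    return UnicodeRecord(int(fields[0], 16), fields[1], fields[2])


if __name__ == "__main__":
    # The UnicodeData.txt record for U+0041.
    sample = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
    record = parse_unicodedata_line(sample)
    print(f"U+{record.codepoint:04X} {record.name} [{record.category}]")
```

Only the first three fields are interpreted here; the remaining fields (combining class, bidi class, case mappings and so on) are ignored.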

answered Nov 16 '22 14:11 by bdonlan


Okay, so I'm just poking around myself so I'm not absolutely sure, but it appears that the table you are looking for is found in the following location relative to the glibc root:

localedata/locales/i18n

This appears to be the Unicode (version 5) locale. It contains the following, which is where I believe you need to make your changes:

% ENCLOSED ALPHANUMERICS/
   <U24D0>..<U24E9>;/
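
If you want to check which code points an entry like the one above covers before editing it, a small script can expand the range. This is only a sketch that handles the `<UXXXX>` and `<UXXXX>..<UYYYY>` forms seen in the excerpt; the helper name parse_range is hypothetical, and the real locale file syntax may allow other forms that this does not handle.

```python
import re

# Matches "<U24D0>" or "<U24D0>..<U24E9>"; trailing ";/" is ignored.
RANGE_RE = re.compile(r"<U([0-9A-Fa-f]{4,8})>(?:\.\.<U([0-9A-Fa-f]{4,8})>)?")


def parse_range(entry: str) -> tuple[int, int]:
    """Return (first, last) code points for one entry (hypothetical helper)."""
    match = RANGE_RE.search(entry)
    if match is None:
        raise ValueError(f"no <Uxxxx> symbol found in {entry!r}")
    first = int(match.group(1), 16)
    # A single symbol without ".." is treated as a one-element range.
    last = int(match.group(2), 16) if match.group(2) else first
    return first, last


if __name__ == "__main__":
    first, last = parse_range("   <U24D0>..<U24E9>;/")
    print(f"U+{first:04X}..U+{last:04X} covers {last - first + 1} code points")
```

For the ENCLOSED ALPHANUMERICS entry this reports the 26 code points from U+24D0 to U+24E9.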

In case you're wondering, the function ctype_output (ld-ctype.c) calls allocate_arrays, which calls wcwidth_table_init. The function wcwidth_table_init is generated by 3level.h (which also generates other tables that follow the same template). This is the chain that I followed to track down the files in localedata/locales.

Like I said, I'm not 100% sure that this is the right table, but I thought I'd share what I had found.

answered Nov 16 '22 13:11 by Naaff