Are there any plans for adding versions of the C standard library string-processing functions that are invariant under the current locale?
Currently there are lots of fragile workarounds, for example, from jansson/strconv.c:
    static void to_locale(strbuffer_t *strbuffer)
    {
        const char *point;
        char *pos;

        point = localeconv()->decimal_point;
        if(*point == '.') {
            /* No conversion needed */
            return;
        }

        pos = strchr(strbuffer->value, '.');
        if(pos)
            *pos = *point;
    }

    static void from_locale(char *buffer)
    {
        const char *point;
        char *pos;

        point = localeconv()->decimal_point;
        if(*point == '.') {
            /* No conversion needed */
            return;
        }

        pos = strchr(buffer, *point);
        if(pos)
            *pos = '.';
    }
These functions preprocess their input so that it can be used independently of the current locale, under several assumptions, one of which is that no call to setlocale happens between these fix-up functions and the call to any of the affected functions.
(1) implies that the preprocessing approach breaks on exotic locales (see https://en.wikipedia.org/wiki/Decimal_mark#Hindu.E2.80.93Arabic_numeral_system for examples). (2) implies that the preprocessing approach cannot be thread-safe without a lock, and that lock must be added to the C library. (3) Just stupid.
If it were only possible to specify the locale for a single call to a string-processing function as a parameter, not affecting any other threads, none of these restrictions would apply.
Questions:
Update:
After searching the Internet, I found the *_l functions, available on FreeBSD, GNU/Linux and Mac OS X. Similar functions exist on Windows as well. These solve my problem; however, they are not in POSIX, which is a superset of C (not really, POSIX relaxes on pointers). So questions 1 and 2 remain open.
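As a rough illustration of the *_l approach, here is a minimal sketch that parses a double with '.' as the decimal point regardless of the current locale. The helper name parse_double_c is made up for this example. It assumes strtod_l is available: on glibc that needs _GNU_SOURCE, and on macOS/FreeBSD the declaration comes from <xlocale.h>. newlocale and freelocale are POSIX.1-2008.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      /* glibc: exposes strtod_l in <stdlib.h> */
#endif
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#if defined(__APPLE__) || defined(__FreeBSD__)
#include <xlocale.h>     /* BSD/macOS: the *_l variants are declared here */
#endif

/* Hypothetical helper (not part of any library): parse a double using the
 * "C" locale, whatever the current locale is.
 * Returns 0 on success, -1 on failure. */
int parse_double_c(const char *text, double *out)
{
    assert(text != NULL && out != NULL);

    /* Build a locale object for the "C" locale; no global state changes. */
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    char *end = NULL;
    double value = strtod_l(text, &end, c_loc);
    freelocale(c_loc);

    if (end == text)     /* no characters were converted */
        return -1;

    *out = value;
    return 0;
}
```

Creating and freeing the locale object on every call keeps the sketch short; real code would probably create it once and reuse it.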
BSD and macOS Sierra (and Mac OS X before it) support _l functions that allow you to specify the locale, rather than relying on the current locale. For example:
    int fprintf_l(FILE * restrict stream, locale_t loc, const char * restrict format, ...);
    int printf_l(locale_t loc, const char * restrict format, ...);
    int snprintf_l(char * restrict str, size_t size, locale_t loc, const char * restrict format, ...);
    int sprintf_l(char * restrict str, locale_t loc, const char * restrict format, ...);
and:
    int fscanf_l(FILE * restrict stream, locale_t loc, const char * restrict format, ...);
    int scanf_l(locale_t loc, const char * restrict format, ...);
    int sscanf_l(const char * restrict str, locale_t loc, const char * restrict format, ...);
As a general design, this seems sensible. The type locale_t is not part of Standard C but is part of POSIX (and defined in <locale.h> there), and is used in <ctype.h> amongst other places. The BSD man pages say that the header to use is <xlocale.h> rather than <locale.h>; this would perhaps be fixed by the standard. Unless there is a major flaw in the design of the BSD functions, these should be a very good basis for any standardization effort, whether that was under POSIX or Standard C.
One issue with the BSD design might be that the locale_t structure is passed by value, not by (constant restricted) pointer, which is a little surprising. However, it is consistent with the POSIX functions such as:
    int isalpha_l(int, locale_t);
A similar scheme might be devised for handling time zone settings, too. There'd be more work in setting that up since there isn't already a time zone type (whereas locale_t is part of POSIX already, and could probably be adopted without change into standard C). But, combined with locale settings, it could make the time routines more easily usable in diverse environments from a single executable.
SQLite has a locale-independent printf implementation, which is good for this sort of thing, as it makes doubles compatible with SQL syntax rules. If you can include SQLite as a dependency, then that might be a viable option.