
setlocale() for LC_MESSAGES to non-existing locale fails

For an embedded software project, I am adding support for translations. Since we are running embedded Linux, I chose libc's gettext(). We have no locale definitions installed at all, so I only try to set the LC_MESSAGES category to my desired locale:

setlocale(LC_MESSAGES, "fake");

(I am using the name fake with a fake.mo file to do a pseudo-translation until I get my hands on proper translations.)

This works fine when linked statically: setlocale() returns the locale name, bindtextdomain() and friends all work, and I get my "translated" string out of it:

setlocale() returned "fake"
current textdomain is "ewe"
current base directory is "/opt/btech/probe/share/locale/WA"
current LC_MESSAGES locale is "fake"
gettext("Error") ==> "Ḗřřǿř"

Now, when I link this dynamically, it doesn't work, neither on the target device nor locally on my PC (with the files installed the same way). The setlocale() call fails, returning a NULL pointer and setting errno to ENOENT (No such file or directory). At the point where setlocale() is called I haven't yet pointed bindtextdomain() to my files, but switching the calls around doesn't help.

Am I doing something wrong? Is my working example above wrong, and should it not really work? Do I need locale definitions for every locale I pass to setlocale(), even if I only set LC_MESSAGES?

This is the source of the test binary:

#include <libintl.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    const char *l = setlocale(LC_MESSAGES, "fake");

    /* setlocale() returns NULL on failure; passing NULL to %s is undefined */
    printf("setlocale() returned \"%s\"\n", l ? l : "(null)");

    bind_textdomain_codeset("ewe", "UTF-8");
    bindtextdomain("ewe", "/opt/btech/probe/share/locale/WA");
    textdomain("ewe");

    printf("current textdomain is \"%s\"\n", textdomain(NULL));
    printf("current base directory is \"%s\"\n", bindtextdomain(textdomain(NULL), NULL));
    printf("current LC_MESSAGES locale is \"%s\"\n", setlocale(LC_MESSAGES, NULL));
    printf("gettext(\"Error\") ==> \"%s\"\n", gettext("Error"));

    return 0;
}

This is the output when linked dynamically (either for target or host):

setlocale() returned "(null)"
current textdomain is "ewe"
current base directory is "/opt/btech/probe/share/locale/WA"
current LC_MESSAGES locale is "C"
gettext("Error") ==> "Error"

EDIT: Linking the test binary statically on my host (x64 Linux) also makes it work, so there is something special about the static build.

Additional question: Can I force gettext to load a specific mo file directly? Basically I would like to have a replacement for bindtextdomain() that takes a file name argument instead.

EDIT 2: I eventually found this post saying that gettext() will load any translation as long as a valid setlocale() call comes first. So my current workaround is to generate a /usr/lib/locale/locale-archive containing only the en_US locale and call setlocale(LC_MESSAGES, "en_US"); setenv("LANGUAGE", "fake", 1);, which ends up loading the correct message catalog. It still feels like an ugly workaround, and I still don't understand why the static link works without it.

nafmo asked Nov 08 '22 16:11


1 Answer

I had a similar problem (an embedded system where I don't have much control over the root file system) and found that this workaround works:

  • Put all translations in /<mydir>/lang/
  • Each locale directory needs a SYS_LC_MESSAGES file, which can be generated with localedef. I made one from the C locale on my system and copied it to each target directory:

    mkdir output
    localedef -f UTF-8 -i /usr/share/i18n/locales/C  output/mylocale
    cp output/mylocale/LC_MESSAGES/SYS_LC_MESSAGES <mydir>/lang/ENG/LC_MESSAGES/
    
  • Set LOCPATH to /<mydir>/lang, the directory that contains the per-locale directories
  • If it does not work, run the program under strace to see which files it is trying to open.

This is my final file layout:

<mydir>
├── lang
│   ├── ENG
│   │   └── LC_MESSAGES
│   │       ├── SYS_LC_MESSAGES
│   │       └── mac.mo
│   ├── FRE
│   │   └── LC_MESSAGES
│   │       ├── SYS_LC_MESSAGES
│   │       └── mac.mo
│   ├── GER
│   │   └── LC_MESSAGES
│   │       ├── SYS_LC_MESSAGES
│   │       └── mac.mo
│   ├── ITA
│   │   └── LC_MESSAGES
│   │       ├── SYS_LC_MESSAGES
│   │       └── mac.mo
│   ├── SPA
│   │   └── LC_MESSAGES
│   │       ├── SYS_LC_MESSAGES
│   │       └── mac.mo

This is the code snippet:

putenv("LOCPATH=/<mydir>/lang");
setlocale(LC_ALL, "");  
setlocale(LC_MESSAGES, "ENG");
bindtextdomain("mac", "/<mydir>/lang");
textdomain("mac");      
gettext("Hello world");  

This is a hack; the proper solution would be to generate the locales correctly.

Matteo Nardi answered Nov 15 '22 04:11
