
Double Underscore in PHP, Wordpress, phpMyAdmin, C, i18n, L10n etc?

To quote another question that led me here:

What does the double underscores in these lines of PHP code mean?

$WPLD_Trans['Yes']=__('Yes',$WPLD_Domain);
$WPLD_Trans['No']=__('No',$WPLD_Domain);

and related questions about the usage of __(), _() etc. in WordPress and elsewhere.

This started as an answer to the question quoted above and to other related ones, but I am posting it as its own question (with an answer) since it became a bit more detailed.

Please feel free to edit and improve – or enter better Answer / Question.

Runium asked Jan 11 '13 01:01



1 Answer

The usage of __(…)

The double underscore is used by various implementations. WordPress (WP) is mentioned; phpMyAdmin (PMA) is another, and so on.

It mirrors the native PHP function gettext(), which in PHP has a single underscore as an alias: _(). This is not unique to PHP. Gettext is originally a long-standing GNU project for writing multilingual programs on Unix-like systems, and we find the _() alias in various programming languages.
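To make the relation between a custom __() and a gettext-style lookup concrete, here is a minimal sketch. It does not use the gettext extension at all: the catalog is a hard-coded array, and the names demo_catalog() and the domain value 'wpld' are assumptions made up for illustration. Only the call shape __($text, $domain) is taken from the lines quoted in the question.

```php
<?php
// Hypothetical in-memory catalog keyed by text domain. Real gettext-based
// code would read compiled .mo files instead of a hard-coded array.
function demo_catalog(): array
{
    return [
        'wpld' => ['Yes' => 'Ja', 'No' => 'Nei'],   // Norwegian sample strings
    ];
}

// A custom double-underscore wrapper with the same call shape as in the
// question. Strings without a translation fall back to the original text,
// which is also what gettext does.
function __(string $text, string $domain = 'default'): string
{
    $catalog = demo_catalog();
    return $catalog[$domain][$text] ?? $text;
}

$WPLD_Domain = 'wpld';   // assumed value, for illustration only
$WPLD_Trans = [];
$WPLD_Trans['Yes'] = __('Yes', $WPLD_Domain);
$WPLD_Trans['No']  = __('No', $WPLD_Domain);

echo $WPLD_Trans['Yes'], ' / ', $WPLD_Trans['No'], "\n";   // Ja / Nei
echo __('Maybe', $WPLD_Domain), "\n";                       // Maybe (untranslated)
```

The fallback to the untranslated string is the reason the source text doubles as the lookup key: a missing translation degrades to the original language instead of to an error.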

Reflection on the subject

In my humble/honest opinion, one could say that _() for native gettext and __() for custom implementations of gettext, or other locale-specific uses, are the only valid ones, as this is a well-established convention. On the other hand, it is concise but not entirely good. E.g. in the ANSI C standard, identifiers beginning with a double underscore __ are reserved for the implementation (the compiler and its libraries); and PHP, being as closely related to C as it is, likewise reserves function names starting with __ as magical. In general it is not the best thing to mix such names up, even across languages (IMHO); it is on the border/thin ice/etc.

(One issue here is the choice of Underscore.js to use _ as its foundation for something that has nothing to do with language support.) Mix it in with prototype.js's $ and $$ and one might understand why someone scratches their head when seeing __() – especially taking into account how closely PHP and JS work together in a working system.


Locale support

In this section I make some notes on one usage: locale support, with weight on i18n and L10n.

It is often used, as with WP and (the well documented) PMA, to provide language support. E.g.:

__('Error'); would yield Feil in Norwegian or Fehler in German. But one important thing to note here is that, in the same realm, we have more than direct translation: there are also other quirks one finds in differing locales/cultures/languages.

E.g. (to specify):

Some languages use a comma – , – as the fractional separator, others use a dot – . – for the same purpose. Other typical locale variations are centimetres vs inches, date format, 12 vs 24 hour clock, currency etc. So:

       NUMBER    LEN        TIME
LANG1: 3,133.44  2.00 inch   5:00 PM
LANG2: 3.133,44  5,08 cm    17:00
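The two rows above can be reproduced with plain PHP functions. This is only a sketch under stated assumptions: the function name format_for() and the labels LANG1/LANG2 are hypothetical, the conventions are hard-coded rather than read from a real locale, and gmdate() is used with seconds since midnight so the result does not depend on the server's time zone.

```php
<?php
// Hypothetical helper: formats a number, a length given in inches, and a
// time of day according to the two conventions shown in the table.
function format_for(string $lang, float $number, float $inches, int $secondsSinceMidnight): array
{
    if ($lang === 'LANG1') {
        return [
            'number' => number_format($number, 2, '.', ','),          // dot as fractional separator
            'len'    => number_format($inches, 2, '.', ',') . ' inch',
            'time'   => gmdate('g:i A', $secondsSinceMidnight),       // 12 hour clock
        ];
    }

    return [
        'number' => number_format($number, 2, ',', '.'),              // comma as fractional separator
        'len'    => number_format($inches * 2.54, 2, ',', '.') . ' cm', // 1 inch = 2.54 cm
        'time'   => gmdate('H:i', $secondsSinceMidnight),             // 24 hour clock
    ];
}

$lang1 = format_for('LANG1', 3133.44, 2.0, 17 * 3600);
$lang2 = format_for('LANG2', 3133.44, 2.0, 17 * 3600);

echo implode('  ', $lang1), "\n";   // 3,133.44  2.00 inch  5:00 PM
echo implode('  ', $lang2), "\n";   // 3.133,44  5,08 cm  17:00
```

In real code these conventions would normally come from the locale itself rather than from an if statement; the point here is only that translation of strings is one part of the job.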

The next layer of code you unveil would probably have some reference to rather cryptic conventions like i18n or l10n in variable names, functions and file names.

Once you know what they mean they are quite handy, though the definitions can be somewhat blurry. In general we have:

i18n: Internationalization
l10n: Localization (Often L10n)
g11n: Globalization (Used by e.g. IBM and Sun Microsystems)
l12y: Localizability (Microsoft)
m17n: Multilingualization (Continuum between internationalization and localization)
...   And so on.

aich, my head hurts.

The number refers to the number of letters between the first and last letter of the word. The letters before and after the number are the first and last letter of the word. Phew. So:

       i18n                 l10n
internationalization    localization
 |                |      |        |
 +-- 18 letters --+      +-- 10 --+

i       18         n    l    10    n
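The rule in the diagram is mechanical enough to write down as a function. A small sketch, where numeronym() is a made-up name and the input is assumed to be a plain ASCII word:

```php
<?php
// Builds a numeronym: first letter, count of letters in between, last letter.
// Assumes a single-byte (ASCII) word; very short words are returned as-is.
function numeronym(string $word): string
{
    $length = strlen($word);
    if ($length < 3) {
        return $word;
    }
    return $word[0] . ($length - 2) . $word[$length - 1];
}

echo numeronym('internationalization'), "\n";   // i18n
echo numeronym('localization'), "\n";           // l10n
echo numeronym('globalization'), "\n";          // g11n
echo numeronym('localizability'), "\n";         // l12y
echo numeronym('multilingualization'), "\n";    // m17n
```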

(We also have more powerful numeronyms that do not follow this convention, like G8 (Group of Eight), but then we're outside the realm of programming - or are we?)

Then to the definitions:

W3C has a rather sane definition of Internationalization (i18n), and Localization (l10n).

  • Internationalization, (i18n), is the design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language.

  • Localization, (l10n), refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale).

The short and concise version from Debian on the subject is:

  • Internationalization (I18N): To make a software potentially handle multiple locales.
  • Localization (L10N): To make a software handle an specific locale.

The use of domains

Frequently, when looking at code or reading about localization, one comes across the usage of domain. A more search-engine-friendly name would perhaps be Translation Domain.

(Here I'm on a bit of thin ice.)

In short one could define this as context. As a base you have a locale, e.g. Scotland. But you can further define a domain as part of a theme, profession etc. E.g. Java in programming would be rather different from Java in a coffee shop. Or cow on a degenerate site about women vs. one about farming.

A page giving some clue is repoze.org with:

  • Translation Domain
  • Translation Directory

A translation directory would typically be:

/path/to/your/translation_root/en_US/LC_MESSAGES/
/path/to/your/translation_root/en_GB/LC_MESSAGES/
/path/to/your/translation_root/nn_NO/LC_MESSAGES/
…
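With gettext the domain usually ends up as the file name inside such a directory, so locale and domain together select one catalog file. A minimal sketch of that mapping follows; mo_path() is a hypothetical helper, and the domain names 'programming' and 'coffeeshop' are invented for the example.

```php
<?php
// Hypothetical helper: builds the path of the compiled catalog for a given
// translation root, locale and domain, following the usual gettext layout
// <root>/<locale>/LC_MESSAGES/<domain>.mo
function mo_path(string $root, string $locale, string $domain): string
{
    return rtrim($root, '/') . '/' . $locale . '/LC_MESSAGES/' . $domain . '.mo';
}

$root = '/path/to/your/translation_root/';

echo mo_path($root, 'en_GB', 'programming'), "\n";
// /path/to/your/translation_root/en_GB/LC_MESSAGES/programming.mo
echo mo_path($root, 'nn_NO', 'coffeeshop'), "\n";
// /path/to/your/translation_root/nn_NO/LC_MESSAGES/coffeeshop.mo
```

In PHP's own gettext extension, the root is what gets passed to bindtextdomain() together with the domain name, and textdomain() then selects which domain _() looks in.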

Files used with GetText

If you take a look at e.g. PMA you will find a directory called locale. It contains a directory for each language. The files in them have the .mo extension; these are gettext-specific files. One typically starts with PO (Portable Object) files, which are plain text. These are compiled to MO (Machine Object) files, which are indexed for fast lookup. More about this here (same as the PMA link above).
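To give a feel for what a PO file contains, here is a deliberately simplified reader for its msgid/msgstr pairs. It is a sketch, not a real PO parser: parse_simple_po() is a made-up name, and multi-line strings, plural forms and message contexts are all ignored. It also does not produce an MO file; compiling is normally left to the gettext tools.

```php
<?php
// Simplified reader for single-line msgid/msgstr pairs in PO text.
// Assumption: every string fits on one line; comments, plurals and
// msgctxt are skipped. The header entry (empty msgid) is dropped.
function parse_simple_po(string $po): array
{
    $entries = [];
    $msgid = null;

    foreach (preg_split('/\R/', $po) as $line) {
        $line = trim($line);
        if (preg_match('/^msgid\s+"(.*)"$/', $line, $m)) {
            $msgid = stripcslashes($m[1]);
        } elseif ($msgid !== null && preg_match('/^msgstr\s+"(.*)"$/', $line, $m)) {
            // Keep only real, translated entries.
            if ($msgid !== '' && $m[1] !== '') {
                $entries[$msgid] = stripcslashes($m[1]);
            }
            $msgid = null;
        }
    }

    return $entries;
}

// Small hand-written sample in PO syntax (Norwegian strings).
$po = implode("\n", [
    '# Sample catalog',
    'msgid ""',
    'msgstr "Content-Type: text/plain; charset=UTF-8\n"',
    '',
    'msgid "Error"',
    'msgstr "Feil"',
    '',
    'msgid "Yes"',
    'msgstr "Ja"',
    '',
    'msgid "Untranslated"',
    'msgstr ""',
]);

print_r(parse_simple_po($po));   // Error => Feil, Yes => Ja
```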


Runium answered Oct 12 '22 23:10
