CSS: "text-transform: capitalize" and Italian, Spanish, Portuguese, French, etc

What should the effect of the CSS rule text-transform: capitalize be when the text inside the HTML tags is in Italian, Spanish, Portuguese, French, etc.? As far as I know, these languages do not capitalize the first letter of every word; that convention is specific to English. I am thinking of a multilingual site where a drop-down menu in the upper right-hand corner selects the language, and the content is then pulled from the database and displayed on the pages. In this scenario the CSS should presumably be the same, independent of the language, so what should the effect be when the opening tag is, say,

<html lang="fr">, <html lang="pt">, <html lang="es">, <html lang="it">, etc.

instead of

<html lang="en">

?

IMHO this should turn off the behavior of "text-transform: capitalize". Please correct me if I'm wrong, or if this should be achieved in some other way, perhaps only by overriding the base CSS file with another CSS file for each supported language.

Thank you for your replies concerning this I18N issue with CSS.

John Sonderson asked Dec 16 '22 02:12


2 Answers

According to the CSS 2.1 definition, text-transform: capitalize “puts the first character of each word in uppercase; other characters are unaffected”.

This is vague without a rigorous definition of “word”, but for “uppercase”, the only feasible interpretation is that it is the uppercase mapping of a character by the Unicode standard.

The spec adds: “The actual transformation in each case is written language dependent.” The reasonable interpretation is that this refers to some language-dependent case mapping exception; e.g., in the Turkish language, “i” maps to “İ” (capital I with dot above), not to the common “I”.

For texts written in Latin letters, “word” is normally a maximal sequence of alphabetic characters, though it can be argued whether a hyphenated word like “tax-free” is two words or one. In any case, the principle is clearly that every word is capitalized. This makes the setting rather useless, since hardly any language has such rules. In English, when titles of works are capitalized, exceptions are made for articles and prepositions; but the CSS property does not know such rules.
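As a rough illustration of "every word is capitalized", here is a minimal Python sketch. It is not how any browser implements the property. It assumes that a "word" is a maximal run of letters, which is only one possible reading, and the name capitalize_every_word is made up for this example.

```python
import re

# Assumption: a "word" is a maximal run of letters
# (word characters minus digits and the underscore).
_WORD = re.compile(r"[^\W\d_]+")


def capitalize_every_word(text: str) -> str:
    """Uppercase the first character of each word; leave the others unaffected."""

    def upper_first(match):
        word = match.group(0)
        return word[0].upper() + word[1:]

    return _WORD.sub(upper_first, text)


if __name__ == "__main__":
    for sample in ("the lord of the rings", "il nome della rosa", "tax-free"):
        print(capitalize_every_word(sample))
```

Under this assumption "tax-free" counts as two words, and articles and prepositions get capitalized like everything else, in English and Italian alike.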

The definition in CSS Text Module Level 3 (a Last Call Working Draft) is somewhat more explicit: “The definition of ‘word’ used for ‘capitalize’ is UA-dependent; [UAX29] is suggested (but not required) for determining such word boundaries. Authors should not expect ‘capitalize’ to follow language-specific titlecasing conventions (such as skipping articles in English).”

This also means that the property is not intended to observe language-specific rules about whether the words in titles of works and comparable expressions should be capitalized at all. Most languages have no such convention.

If you specify text-transform: capitalize, you are requiring that all words be capitalized, no matter what the language is, no matter what the context is, no matter what the words are. If you think this makes the setting rather useless, you drew the right conclusion.

Proper localization capitalizes words in actual context when needed.

Jukka K. Korpela answered May 14 '23 01:05


I agree with you that the behavior of many CSS directives should depend on the value of the lang attribute. The problem is that it does not.

For this reason, using any of the text-transform directives introduces an I18n error into the page. There are (at least) two reasons for this:

  • Text transformation rules depend strictly on the language
  • In some languages, the case of the first letter carries additional information

For the former, people usually give these three examples:

  • Turkish has two "i" characters: dotted i (uppercase: İ) and dotless ı (uppercase: I). As you can see, there is no way to convert them correctly without language context
  • German sharp s (ß): according to grammar rules, it should be capitalized as the two letters SS. I haven't tested it for a while, but a few months ago the only web browser to get it right was Chrome...
  • Greek sigma has one uppercase form, Σ, but two lowercase forms: σ (regular) and ς (final). Which of the two is used depends on its position in the word.
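The three examples can be tried out with a short Python sketch. Python's built-in str.upper() and str.lower() take no language argument, so they show what a language-neutral mapping does. The helper upper_turkish is hypothetical, written only for this illustration, and covers nothing beyond the two Turkish letters.

```python
def upper_turkish(text: str) -> str:
    """Hypothetical helper: uppercase using the Turkish dotted/dotless i rule."""
    # Map the two Turkish letters first (i -> U+0130, U+0131 -> I),
    # then let the default mapping handle every other character.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


if __name__ == "__main__":
    # Turkish: the default mapping loses the dot, the helper keeps it.
    print("istanbul".upper(), upper_turkish("istanbul"))
    # German: sharp s expands to two letters when uppercased.
    print("straße".upper())
    # Greek: the lowercase form of capital sigma depends on its position.
    print("ΣΟΦΟΣ".lower())
```

The point of the sketch is only that the Turkish result cannot come out right unless something tells the code which language it is dealing with.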

There are more examples of language-specific case transformation (especially in Greek); I just wanted to point out that you really need language information to transform case correctly.

As for the latter (additional meaning): in German, nouns start with a capital letter but other words do not. Capitalizing every word in a sentence ("title-casing" it) may actually alter its meaning.

The moral of this story is simple: do not use the text-transform directive. Leave it to the translators; they will know what case to use.
Oh, and by the way, the same words may be translated differently depending on the context, so re-using translations and "correcting" the case is a really bad idea.

Paweł Dyda answered May 13 '23 23:05