Suppose I'm writing an article in HTML. The language of the article is Swedish, so I have <html lang="sv">. Now I want to mark up the abbreviation properly in the following text:
HTML kan användas till mycket.
To this end, I first do
<abbr title="HyperText Markup Language">HTML</abbr> kan användas till mycket.
This alone is not good enough, however, because the title attribute inherits the language of the document, Swedish (sv), even though its value is English. Besides being a theoretical problem, this will make screen readers pronounce the title in a highly awkward way. To remedy this, I could do
<abbr title="HyperText Markup Language" lang="en">HTML</abbr> kan användas till mycket.
This is even worse, though, since now the abbreviation 'HTML' will be read in English instead of Swedish [so from a Swedish point of view, it will sound like "ejtsch-ti-emm-ell" instead of "hå-te-emm-ell"].
Hence, the abbreviation, or the text contents of the abbr node, should be in Swedish, but the title attribute should be in English. What is the preferred (HTML5) way of marking this up? Is it
<abbr title="HyperText Markup Language" lang="en">
<span lang="sv">HTML</span>
</abbr> kan användas till mycket.
?
The <abbr> HTML element represents an abbreviation or acronym.
The <abbr> tag marks up an abbreviation, that is, the shortened form of a word or phrase. The older <acronym> tag served the same purpose for acronyms but is obsolete in HTML5, where <abbr> covers both. The markup can provide useful information to browsers, translation systems, and search engines.
The <abbr> tag is often used together with the global title attribute to provide an expansion of the abbreviation.
The title attribute specifies extra information about an element. The information is most often shown as tooltip text when the mouse moves over the element. The title attribute is valid on any HTML element, although it is not necessarily useful on all of them.
Your conclusion is correct: in HTML language markup, you cannot declare the content of an element to be in a different language from its attribute values, since the lang attribute sets the language of both. And the workaround is the one you have found: use inner markup for the content. There’s no difference here between HTML 4 and HTML5.
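For illustration, here is a minimal sketch of that workaround as a complete page. The document title and the surrounding paragraph element are assumptions added only to make the example self-contained; the abbr markup itself is the one from the question.

```html
<!DOCTYPE html>
<html lang="sv">
  <head>
    <meta charset="utf-8">
    <title>Exempel</title>
  </head>
  <body>
    <p>
      <!-- lang="en" on abbr applies to its title attribute and, by
           inheritance, to its content -->
      <!-- the inner span switches the visible text back to Swedish -->
      <abbr title="HyperText Markup Language" lang="en"><span lang="sv">HTML</span></abbr>
      kan användas till mycket.
    </p>
  </body>
</html>
```

The inner span carries no meaning of its own; it exists only so that a second lang attribute can override the one set on the abbr element.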
However, this is a very theoretical issue.
First, the abbr markup is almost useless in practice. Abbreviations should be explained, when needed, in normal text content, not in attributes. Speech browsers may optionally read title attribute values, but in normal mode, they ignore them – people using speech browsers prefer fast reading and are often accustomed to rather high speech rates, and spelling out abbreviations would disturb this.
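As a rough sketch of explaining the abbreviation in normal text instead, the expansion could be given in parentheses at first mention. The parenthetical placement is an assumption, just one possible way to word it; the lang attribute now sits only on the English phrase, so no attribute value needs a language of its own.

```html
<p>
  <!-- The paragraph inherits lang="sv" from the html element; only the
       English expansion is marked as English -->
  HTML (<span lang="en">HyperText Markup Language</span>) kan användas till mycket.
</p>
```

Later mentions of HTML in the same article can then be left as plain text.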
Second, “abbreviations” like “HTML” (which is really a proper name rather than anything else) should seldom be spelled out in speech. You wouldn’t want to hear speech like “The new version of HyperText Markup Language is HyperText Markup Language five, which has many extensions to HyperText Markup Language four.”
Third, language markup is largely write-only. In most situations, it is just ignored. Google does not care. Browsers may use it to decide on the default font to be used, but most pages specify their own fonts, so the defaults don’t matter. Some speech browsers may recognize a few languages from lang attributes, but most of them don’t: they read the content by the rules for the language selected by the user. Those that use language markup may make a distinction between British and US English, so if you still think language markup is relevant, consider using lang="en-GB" in this context. (I’m assuming that most Swedish-speaking people would find Received Pronunciation more understandable and natural than Standard American, but I might be wrong.)