Question

Suppose I'm writing an article in HTML. The language of the article is Swedish, so I have <html lang="sv">. Now I want to mark up the abbreviation properly in the following text:

HTML kan användas till mycket.

To this end, I first do

<abbr title="HyperText Markup Language">HTML</abbr> kan användas till mycket.

This alone is not good enough, however, because the title attribute inherits the document language, Swedish (sv), even though its text is English. Besides being a theoretical problem, this will make screen readers pronounce the title in a highly awkward way. To remedy this, I could do

<abbr title="HyperText Markup Language" lang="en">HTML</abbr> kan användas
  till mycket.

This is even worse, though, since now the abbreviation 'HTML' will be read in English instead of Swedish [so from a Swedish point of view, it will sound like "ejtsch-ti-emm-ell" instead of "hå-te-emm-ell"].

Hence, the abbreviation, or the text contents of the abbr node, should be in Swedish, but the title attribute should be in English. What is the preferred (HTML5) way of marking this up? Is it

<abbr title="HyperText Markup Language" lang="en">
  <span lang="sv">HTML</span>
</abbr> kan användas till mycket.

?


Solution

Your conclusion is correct: in HTML language markup, you cannot declare the content of an element to be in a different language from its attribute values, because the lang attribute applies to both. The workaround is the one you have found: use inner markup for the content. There’s no difference here between HTML 4 and HTML5.

However, this is a very theoretical issue.

First, the abbr markup is almost useless in practice. Abbreviations should be explained, when needed, in normal text content, not in attributes. Speech browsers may optionally read title attribute values, but in normal mode, they ignore them – people using speech browsers prefer fast reading and are often accustomed to rather high speech rates, and spelling out abbreviations would disturb this.
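As a minimal sketch of explaining the abbreviation in normal text content rather than in an attribute, the sentence from the question could spell out the expansion in parentheses on first mention. The parenthetical wording and the use of a span for the English phrase are assumptions made for illustration, not something the question prescribes:

```html
<!-- Document language is Swedish; only the English expansion is marked as English. -->
<p>HTML (<span lang="en">HyperText Markup Language</span>) kan användas till mycket.</p>
```

Here the abbreviation itself stays in the surrounding Swedish context, and the expansion is visible to all readers instead of being hidden in a title attribute.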

Second, “abbreviations” like “HTML” (which is really a proper name rather than anything else) should seldom be spelled out in speech. You wouldn’t want to hear speech like “The new version of HyperText Markup Language is HyperText Markup Language five, which has many extensions to HyperText Markup Language four.”

Third, language markup is largely write-only. In most situations, it is just ignored. Google does not care. Browsers may use it to choose a default font, but most pages specify their own fonts, so the defaults don’t matter. Some speech browsers may recognize a few languages from lang attributes, but most of them don’t: they read the content by the rules for the language selected by the user. Those that use language markup may make a distinction between British and US English, so if you still think language markup is relevant, consider using lang="en-GB" in this context. (I’m assuming that most Swedish-speaking people would find Received Pronunciation more understandable and natural than Standard American, but I might be wrong.)
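If you do keep the language markup, the workaround from the question combined with the British English suggestion might look like the sketch below. Whether any given speech browser actually honors the en-GB subtag is an assumption, not something guaranteed here:

```html
<!-- lang="en-GB" applies to the title attribute; the inner span restores Swedish for the visible text. -->
<abbr title="HyperText Markup Language" lang="en-GB"><span lang="sv">HTML</span></abbr> kan användas till mycket.
```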

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow