Your conclusion is correct: in HTML, you cannot mark the content of an element as being in a different language from its attribute values, since the lang attribute sets the language of both. And the workaround is the one you have found: use inner markup for the content. There’s no difference here between HTML 4 and HTML5.
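A minimal sketch of that workaround, assuming a Swedish-language page where the abbreviation itself should be treated as English. The sentence and the title text are made up for illustration and are not taken from your markup:

```html
<!-- Hypothetical fragment of a Swedish-language page -->
<p lang="sv">
  Sidan är skriven i
  <!-- abbr inherits lang="sv", so its title attribute counts as Swedish;
       the inner span marks only the visible content as English -->
  <abbr title="märkspråk för webbsidor"><span lang="en">HTML</span></abbr>.
</p>
```

The inner span is needed because a lang attribute placed on the abbr element itself would apply to the title value as well as to the content.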
However, this is a very theoretical issue.
First, the abbr markup is almost useless in practice. Abbreviations should be explained, when needed, in normal text content, not in attributes. Speech browsers may optionally read title attribute values, but in normal mode, they ignore them – people using speech browsers prefer fast reading and are often accustomed to rather high speech rates, and spelling out abbreviations would disturb this.
Second, “abbreviations” like “HTML” (which is really a proper name rather than anything else) should seldom be spelled out in speech. You wouldn’t want to hear speech like “The new version of HyperText Markup Language is HyperText Markup Language five, which has many extensions to HyperText Markup Language four.”
Third, language markup is largely write-only. In most situations, it is just ignored. Google does not care. Browsers may use it to choose a default font, but most pages specify their own fonts, so the defaults don’t matter. Some speech browsers may recognize a few languages from lang attributes, but most of them don’t: they read the content by the rules for the language selected by the user. Those that do use language markup may distinguish between British and US English, so if you still think language markup is relevant, consider using lang="en-GB" in this context. (I’m assuming that most Swedish-speaking people would find Received Pronunciation more understandable and natural than Standard American, but I might be wrong.)