Question

Let's say I'm designing a website where we have English, French, Spanish, German, and Korean (I'm not, but let's pretend I am).

I cannot rely upon services such as Google Translate, as the website is for business, not entertainment. Let's say I have access to professional translators who can translate something in context into another language and give me that text.

What are some known and simple ways to serve up content over multiple languages with a website?

There are lots of options, such as separate pages, using a database, and so forth, but I can't really decide what's best, how the concept would scale, what needs to be considered, or how to deal with missing translations.

Are there any well-established practices for this?


Solution

The broad topic you're asking about is called "Internationalization and Localization" (or I18N and L10N for short). The important thing to remember is that it's not just about translating messages. There are many other things that go into internationalizing a website.

The more obvious things you will need are:

  • A character encoding that works for characters in all languages, not just English (this means everything, down to the database, should use a Unicode encoding such as UTF-8)
  • Some way of representing the user's locale (e.g. Java's Locale class)
  • A common pattern for producing a message in that user's locale (e.g. Spring's MessageSource)
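
To make the last two points concrete, here is a minimal sketch using only the JDK. The CATALOG map, the "greeting" key, and the message() helper are made up for illustration; a real site would load translator-supplied text from files or a database, and Spring's MessageSource would play the role of message().

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;

final class LocaleMessages {
    // Hypothetical in-memory catalog keyed by language code. A real site would
    // load these from translator-supplied, UTF-8 encoded files or a database.
    static final Map<String, Map<String, String>> CATALOG = Map.of(
        "en", Map.of("greeting", "Welcome, {0}!"),
        "fr", Map.of("greeting", "Bienvenue, {0} !"),
        "de", Map.of("greeting", "Willkommen, {0}!"));

    // Looks up the pattern for the locale's language (English if unsupported)
    // and fills in the arguments using that locale's formatting rules.
    static String message(Locale locale, String key, Object... args) {
        Map<String, String> messages =
            CATALOG.getOrDefault(locale.getLanguage(), CATALOG.get("en"));
        return new MessageFormat(messages.get(key), locale).format(args);
    }

    public static void main(String[] args) {
        // The locale would normally come from the user's profile or request.
        Locale user = Locale.forLanguageTag("fr-CA");
        System.out.println(message(user, "greeting", "Ana"));
        System.out.println(message(Locale.GERMAN, "greeting", "Ana"));
    }
}
```

The point of the pattern is that page code only ever mentions a key and a locale, never the translated text itself.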

Other things you need to consider:

  • Properly sorting strings based on locale
  • Formatting dates based on locale
  • Always showing times in the user's time zone
  • Showing distance measurements in the units of the user's locale (e.g. miles versus kilometers)
  • Laying out the website right-to-left for languages like Hebrew
  • Think about how you pluralize your messages. String message = "Please fix the following error" + (errors.size() > 1 ? "s" : ""); just doesn't work in an internationalized program.
  • Think about how to lay out your web pages when the length of text may vary wildly, and never assume that a character is more-or-less a certain width (a single Kanji character might be 8 times wider than a lowercase 'i')
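
A few of these points (sorting, dates, time zones, plurals) can be sketched with plain JDK classes. The class and method names below are made up for illustration, and the message pattern is an assumed example of what a translator might supply. Note that java.text's choice format only handles simple numeric ranges; languages with more complex plural rules are better served by ICU's plural support.

```java
import java.text.Collator;
import java.text.MessageFormat;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class LocaleAwareFormatting {
    // Sorts with the locale's collation rules instead of raw code-point order.
    static List<String> sort(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    // Formats a date in the long style that the given locale expects.
    static String longDate(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
            .withLocale(locale)
            .format(date);
    }

    // Converts a stored UTC instant into the user's own time zone.
    static ZonedDateTime inUserZone(Instant instant, String zoneId) {
        return instant.atZone(ZoneId.of(zoneId));
    }

    // Picks singular or plural inside the translated pattern, so the code
    // never glues an "s" onto a string.
    static String errorPrompt(String pattern, Locale locale, int errorCount) {
        return new MessageFormat(pattern, locale).format(new Object[] {errorCount});
    }

    public static void main(String[] args) {
        // "\u00C4pfel" is "Apfel" with an umlaut on the first letter.
        System.out.println(sort(List.of("Zebra", "\u00C4pfel", "Apfel"), Locale.GERMAN));

        LocalDate date = LocalDate.of(2024, 3, 5);
        System.out.println(longDate(date, Locale.US));
        System.out.println(longDate(date, Locale.FRANCE));

        System.out.println(inUserZone(Instant.parse("2024-01-01T00:00:00Z"), "Asia/Seoul"));

        // Assumed English pattern; each language would get its own pattern.
        String en = "Please fix the following {0,choice,1#error|1<errors}.";
        System.out.println(errorPrompt(en, Locale.ENGLISH, 1));
        System.out.println(errorPrompt(en, Locale.ENGLISH, 3));
    }
}
```

Because the whole sentence lives in the pattern, a translator can reorder words or change the plural form without any code change.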

The best resource I can find for this is the ICU library's User Guide. If you use Java, ICU (in its Java form, ICU4J) is the library to use.

Hopefully this answer is a helpful start!

OTHER TIPS

Take a look at Google's recommendations for multi-regional and multilingual sites. Hope that information comes in handy. Best of luck.

Totally agree with @Michael D and the other developers who posted answers. Although an answer to this question has already been accepted, I think one small option, the :lang() pseudo-class, can be helpful for creating multilingual sites as well. The :lang() pseudo-class lets you select elements based on the language declared for them in the document.

CSS code:


    q:lang(fr) {
        quotes: "\00AB" "\00BB";    /* Quotations for French */
    }

    q:lang(en) {
        quotes: "\201C" "\201D";    /* Quotations for English */
    }


HTML code:

    <html>
    <body>
        <p>Quote in French: <q lang="fr">Être ou ne pas être</q>.</p>
        <p>Quote in English: <q lang="en">To be or not to be</q>.</p>
    </body>
    </html>

And the output will be like this:

Quote in French: «Être ou ne pas être».

Quote in English: “To be or not to be”.

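Please note that :lang() works on the language declared in the document (the lang attribute on an element or one of its ancestors), not on the text itself: the browser does not detect what language a piece of text is written in.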

We have a set of files on disk that contain all the strings in a given widget/module/whatever, with separate files per language, e.g.:

foo.strings == generic (happens to be US English)
foo.fr.strings == French
foo.fr-CA.strings == Canadian French
foo.en-CA.strings == Canadian English

Based on the client's Accept-Language header, we determine which language the user wants.
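
For anyone on Java, the header negotiation could look roughly like this. It is only a sketch: the SUPPORTED list and the English default are assumptions taken from the languages in the question, not our actual setup. Locale.LanguageRange.parse and Locale.lookup are standard JDK calls.

```java
import java.util.List;
import java.util.Locale;

final class LanguageNegotiator {
    // Assumed set of languages the site has translations for.
    static final List<Locale> SUPPORTED = List.of(
        Locale.ENGLISH, Locale.FRENCH, Locale.forLanguageTag("es"),
        Locale.GERMAN, Locale.KOREAN);

    // Assumed fallback when nothing in the header matches.
    static final Locale DEFAULT = Locale.ENGLISH;

    // Picks the best supported locale for an Accept-Language header value.
    static Locale negotiate(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.isBlank()) {
            return DEFAULT;
        }
        try {
            // parse() orders the ranges by their q weights, highest first.
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            // lookup() also tries shorter tags, so "fr-CA" can match "fr".
            Locale best = Locale.lookup(ranges, SUPPORTED);
            return best != null ? best : DEFAULT;
        } catch (IllegalArgumentException malformedHeader) {
            return DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(negotiate("fr-CA,fr;q=0.8,en;q=0.5"));
        System.out.println(negotiate("ja"));
    }
}
```

It is usually worth letting the user override the negotiated language explicitly (a language picker stored in a cookie or profile), since browser headers do not always reflect what the person actually wants.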

When a given language is first requested, we hit the file system to build up the big string mapping for that language, then cache it in memory. If a given string isn't defined in fr-CA, we hop up the stack to fr, then eventually to the generic file.
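
The fallback and caching idea, sketched in Java. This is not our real code: StringCatalog is a made-up name, the constructor takes an in-memory map as a stand-in for the .strings files on disk, and returning the key itself for a string that is missing everywhere is just one possible policy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class StringCatalog {
    // Stand-in for the files on disk: "" is the generic file (foo.strings),
    // "fr" is foo.fr.strings, "fr-CA" is foo.fr-CA.strings, and so on.
    private final Map<String, Map<String, String>> files;
    // One merged string table per requested language, built on first use.
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

    StringCatalog(Map<String, Map<String, String>> files) {
        this.files = files;
    }

    // "fr-CA" becomes ["fr-CA", "fr", ""], most specific first.
    static List<String> fallbackChain(String tag) {
        List<String> chain = new ArrayList<>();
        String current = tag;
        while (!current.isEmpty()) {
            chain.add(current);
            int dash = current.lastIndexOf('-');
            current = dash < 0 ? "" : current.substring(0, dash);
        }
        chain.add("");
        return chain;
    }

    // Merges generic first, then more specific files, so specific entries win.
    private Map<String, String> load(String tag) {
        Map<String, String> merged = new HashMap<>();
        List<String> chain = fallbackChain(tag);
        for (int i = chain.size() - 1; i >= 0; i--) {
            merged.putAll(files.getOrDefault(chain.get(i), Map.of()));
        }
        return merged;
    }

    // Returns the translated string, or the key itself if no file defines it.
    String get(String tag, String key) {
        return cache.computeIfAbsent(tag, this::load).getOrDefault(key, key);
    }

    public static void main(String[] args) {
        StringCatalog catalog = new StringCatalog(Map.of(
            "", Map.of("title", "Home", "cart", "Cart", "help", "Help"),
            "fr", Map.of("title", "Accueil", "cart", "Panier"),
            "fr-CA", Map.of("cart", "Panier d'achats")));
        System.out.println(catalog.get("fr-CA", "cart"));
        System.out.println(catalog.get("fr-CA", "title"));
        System.out.println(catalog.get("fr-CA", "help"));
    }
}
```

Falling back to the generic string keeps the page usable while a translation is pending; logging each fallback is an easy way to produce a to-do list for the translators.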

Pages are generated dynamically and the generated version of each URL is cached depending on the user's language headers (among other things).
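
The page cache boils down to making the language part of the cache key. A rough Java sketch, with PageCache and the renderer function as made-up names (the real key has more parts than language and URL):

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

final class PageCache {
    private final Map<String, String> pages = new ConcurrentHashMap<>();
    // Hypothetical page generator: takes a URL and a locale, returns the HTML.
    private final BiFunction<String, Locale, String> renderer;

    PageCache(BiFunction<String, Locale, String> renderer) {
        this.renderer = renderer;
    }

    // Renders a page once per (language, URL) pair and reuses it afterwards.
    String get(String url, Locale locale) {
        // Neither language tags nor URLs contain a raw space, so it is a safe separator.
        String key = locale.toLanguageTag() + " " + url;
        return pages.computeIfAbsent(key, k -> renderer.apply(url, locale));
    }

    public static void main(String[] args) {
        PageCache cache = new PageCache(
            (url, locale) -> "<html lang=\"" + locale.toLanguageTag() + "\">" + url + "</html>");
        System.out.println(cache.get("/pricing", Locale.FRENCH));
        System.out.println(cache.get("/pricing", Locale.KOREAN));
    }
}
```

If an HTTP cache or CDN sits in front of the site, the same idea applies there: responses that vary by language should say so with a Vary: Accept-Language header, or use a distinct URL per language.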

Hope that helps

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow