Question

Let's say I have the following Django model:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)

Each label has an ID number, the label text, and an abbreviation. Now, I want to have these labels translatable into other languages. What is the best way to do this?

As I see it, I have a few options:

1: Add the translations as fields on the model:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label_english = models.CharField(max_length=255)
    abbreviation_english = models.CharField(max_length=255)
    label_spanish = models.CharField(max_length=255)
    abbreviation_spanish = models.CharField(max_length=255)

This is obviously not ideal: adding a language requires editing the model, and the correct field name depends on the language.

2: Add the language as a foreign key:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language')

This is much better. Now I can ask for all labels in a certain language and throw them into a dict:

labels = StandardLabel.objects.filter(language=1)
labels = dict((x.pk, x) for x in labels)

But the problem here is that the labels dict is meant to be a lookup table, like so:

x = OtherObjectWithAReferenceToTheseLabels.objects.get(pk=3)
thelabel = labels[x.labelIdNumber].label

That doesn't work when there is one row per translation, since a single label can then have several rows, one per language. To solve that, I need another field:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    group_id = models.IntegerField(db_index=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language')
    class Meta:
        unique_together = (("group_id", "language"),)

# and I need to group them differently:
labels = StandardLabel.objects.filter(language=1)
labels = dict((x.group_id, x) for x in labels)

3: Throw label text out into a new model:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    text = models.ManyToManyField('LabelText')

class LabelText(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language')

labels = StandardLabel.objects.filter(text__language=1)
labels = dict((x.pk, x) for x in labels)

But then this doesn't work, and causes a database hit every time I reference the label's text:

x = OtherObjectWithAReferenceToTheseLabels.objects.get(pk=3)
thelabel = labels[x.labelIdNumber].text.get(language=1)

I've implemented option 2, but I find it very ugly: I don't like the group_id field, and I can't think of a better name for it. In addition, StandardLabel as I'm using it is an abstract model, which I subclass to get different label sets for different fields.

I suppose that if option 3 didn't hit the database, it's what I'd choose. I believe the real problem is that the filter text__language=1 doesn't cache the LabelText instances, so the DB is hit again when I call text.get(language=1).

What are your thoughts on this? Can anyone recommend a cleaner solution?

Edit: Just to make it clear, these are not form labels, so the Django Internationalization system doesn't help.


Solution

I'd much prefer to add a field per language than a new model instance per language. It does require schema alteration when you add a new language, but that isn't hard, and how often do you expect to add languages? In the meantime, it'll give you better database performance (no added joins or indexes) and you don't have to muck up your query logic with translation stuff; keep it all in the templates where it belongs.
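To make the "field per language" idea concrete, here is a minimal plain-Python sketch of what the lookup could look like. The Label dataclass only stands in for the Django model, and the localized helper and DEFAULT_LANGUAGE fallback are hypothetical names for illustration, not part of Django.

```python
from dataclasses import dataclass

DEFAULT_LANGUAGE = "english"  # assumed fallback language for this sketch


@dataclass
class Label:
    """Plain-Python stand-in for a model with one column per language."""
    label_english: str
    abbreviation_english: str
    label_spanish: str = ""
    abbreviation_spanish: str = ""


def localized(obj, field, language):
    """Return obj.<field>_<language>, falling back to the default language
    when that column is missing or empty."""
    value = getattr(obj, f"{field}_{language}", "")
    return value or getattr(obj, f"{field}_{DEFAULT_LANGUAGE}")


street = Label("Street", "St.", label_spanish="Calle")
print(localized(street, "label", "spanish"))         # translated column
print(localized(street, "abbreviation", "spanish"))  # empty, so falls back
```

The objects that reference a label still point at a single row, and the language only matters at the moment the text is displayed.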

Even better, use a reusable app like django-transmeta or django-modeltranslation that makes this stupid simple and almost completely transparent.

OTHER TIPS

Another option you might consider, depending on your application design of course, is to make use of Django's internationalization features. Their approach is similar to the one commonly found in desktop software.

I see the question was edited to add a reference to Django internationalization, so you do know about it, but the i18n features in Django apply to much more than just forms. They touch quite a lot, and need only a few tweaks to your app design.

Their docs are here: http://docs.djangoproject.com/en/dev/topics/i18n/#topics-i18n

The idea is that you define your model as if there was only one language. In other words, make no reference to language at all, and put only, say, English in the model.

So:

class StandardLabel(models.Model):
    abbreviation = models.CharField(max_length=255)
    label = models.CharField(max_length=255)

I know this looks like you've totally thrown out the language issue, but you've actually just relocated it. Instead of the language being in your data model, you've pushed it to the view.

The Django internationalization features allow you to generate text translation files, and provide a number of tools for pulling text out of the system into files. This is actually quite useful because it allows you to send plain files to your translator, which makes their job easier. Adding a new language is as easy as getting the file translated into a new language.

Each translation file maps the label text from the database to its translation in one language. There are functions for handling the translation dynamically at run time for models, admin views, JavaScript, and templates.

For example, in a template, you might do something like:

<b>Hello {% trans "Here's the string in english" %}</b>

Or in view code, you could do:

from django.utils.translation import gettext  # formerly ugettext, which was removed in Django 4.0

# See docs on setting language, or getting Django to auto-set language
s = StandardLabel.objects.get(id=1)
lang_specific_label = gettext(s.label)

Of course, if your app is all about entering new languages on the fly, then this approach may not work for you. Still, have a look at the Internationalization project, as you may either be able to use it "as is" or be inspired to a Django-appropriate solution that does work for your domain.
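The run-time lookup described above can be sketched with Python's standard gettext module alone. This is only an illustration under assumptions: Django reads compiled message catalogs, whereas here a hypothetical DictTranslations class keeps the catalog in a dict, and the sample labels are made up.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Translation object backed by a dict instead of a compiled catalog
    (a hypothetical helper for this sketch)."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Strings without a translation are returned unchanged.
        return self._messages.get(message, message)


# Keys are the English labels as they would be stored in the database.
spanish = DictTranslations({"Street": "Calle", "Avenue": "Avenida"})

labels_from_db = ["Street", "Avenue", "Boulevard"]
translated = [spanish.gettext(text) for text in labels_from_db]
print(translated)
```

One thing to keep in mind with this design: the strings live in the database rather than in source code, so they would need to get into the translation files by some other route than scanning the code.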

I would keep things as simple as possible. The lookup will be faster and the code cleaner with something like this:

class StandardLabel(models.Model):
    abbreviation = models.CharField(max_length=255)
    label = models.CharField(max_length=255)
    language = models.CharField(max_length=2)
    # or, alternately, specify language as a foreign key:
    #language = models.ForeignKey(Language)

    class Meta:
        unique_together = ('language', 'abbreviation')

Then query based on abbreviation and language:

l = StandardLabel.objects.get(language='en', abbreviation='suite')
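For what that model and query amount to at the database level, here is a rough sketch using the standard sqlite3 module instead of Django. The table name and sample rows are assumptions for illustration; the UNIQUE constraint plays the role of unique_together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE standard_label (
        id INTEGER PRIMARY KEY,
        abbreviation TEXT NOT NULL,
        label TEXT NOT NULL,
        language TEXT NOT NULL,
        UNIQUE (language, abbreviation)
    )
    """
)
insert_sql = (
    "INSERT INTO standard_label (abbreviation, label, language) "
    "VALUES (?, ?, ?)"
)
# Placeholder rows: the same abbreviation once per language.
conn.executemany(
    insert_sql,
    [("suite", "Suite", "en"), ("suite", "Suite-ES", "es")],
)

# Equivalent of .get(language='en', abbreviation='suite')
row = conn.execute(
    "SELECT label FROM standard_label WHERE language = ? AND abbreviation = ?",
    ("en", "suite"),
).fetchone()
print(row[0])

# A second 'en' row for the same abbreviation violates the constraint.
try:
    conn.execute(insert_sql, ("suite", "Duplicate", "en"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```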

Although I would go with Daniel's solution, here is an alternative from what I've understood from your comments:

You can use an XMLField or JSONField to store your language/translation pairs. This would allow your objects referencing your labels to use a single id for all translations. And then you can have a custom manager method to call a specific translation:

Label.objects.get_by_language('ru', **kwargs)

Or a slightly cleaner and slightly more complicated solution that plays well with the admin would be to normalize the XMLField into another model with a many-to-one relationship to the Label model. Same API, but instead of parsing XML it would query the related models.

In both suggestions there's a single object that users of a label point to.
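A minimal sketch of the JSON variant, in plain Python rather than Django: the ROWS dict stands in for a table with one JSON column per label, and this get_by_language is a hypothetical module-level function (with an assumed fallback to English), not the manager method itself.

```python
import json

# One stored row per label id; the value is the serialized JSON column.
ROWS = {
    1: json.dumps({
        "en": {"label": "Street", "abbreviation": "St."},
        "ru": {"label": "Улица", "abbreviation": "ул."},
    }),
}


def get_by_language(label_id, language, default="en"):
    """Return the translation dict for one label, falling back to the
    default language when the requested one is missing."""
    translations = json.loads(ROWS[label_id])
    return translations.get(language, translations[default])


print(get_by_language(1, "ru")["label"])
print(get_by_language(1, "fr")["label"])  # no French entry, falls back
```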

I wouldn't worry about the queries too much. Django caches the results of a QuerySet once it has been evaluated, and your DBMS probably has its own caching as well.
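If the per-lookup queries do turn out to matter, one option with the many-to-one layout is to load every translation for the current language in a single query and keep the result as a lookup table. A rough sketch with sqlite3 follows; the table names, columns, sample data, and the load_labels helper are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE standard_label (id INTEGER PRIMARY KEY);
    CREATE TABLE label_text (
        id INTEGER PRIMARY KEY,
        standard_label_id INTEGER NOT NULL REFERENCES standard_label(id),
        language TEXT NOT NULL,
        label TEXT NOT NULL,
        UNIQUE (standard_label_id, language)
    );
    INSERT INTO standard_label (id) VALUES (1), (2);
    INSERT INTO label_text (standard_label_id, language, label) VALUES
        (1, 'en', 'Street'), (1, 'es', 'Calle'),
        (2, 'en', 'Avenue'), (2, 'es', 'Avenida');
    """
)


def load_labels(language):
    """Fetch all label texts for one language with a single query and
    return them keyed by the id of the label they belong to."""
    rows = conn.execute(
        "SELECT standard_label_id, label FROM label_text WHERE language = ?",
        (language,),
    ).fetchall()
    return dict(rows)


labels = load_labels("es")
print(labels[1])  # later lookups are plain dict access, no further queries
```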

Licensed under: CC-BY-SA with attribution