What's the right way to do a Django/South/PostgreSQL migration from calculated-as-needed summary values to maintained-in-database summary values?

StackOverflow https://stackoverflow.com/questions/16284137

Question

My Django app has a User model, and each User has many Transactions. Some of my views display a summary (summation) of all transaction amounts; let's call it the 'total'. So far, this has been tallied up when needed for display.

Now, I'd like to add this tally to essentially every page a User views... so I'd prefer it to come from a DB/model field, that's maintained with each new Transaction. I know how to do that: add a 'total' field to my User model, update it as needed (using the Django ORM F()-expressions for race-proof-ness). So far so good.

My question regards setting the initial 'total' value, tracking all Transactions so far (before the running-tally was implemented).

I suppose I could, during a maintenance window where no new Transactions arrive, do a data-migration initializing all User.total values to the current tally. However, I'd rather not do that: the last similar big data-migration I did took hours longer than expected.

Is there a recommended technique/trick for doing the catchup tallying without a long outage, while new transactions are also arriving?

I suppose I could write the catchup data-migration to consider only transactions before the threshold date (or id) at the moment the new, tally-maintaining code is deployed. (Then, I'd run the data-migration while the system is up, and only reveal the new tallies in the interface when the migration completes, no matter how long that takes.) However, I'd rather not code this date/id threshold into the migration source code. Is there South metadata that could be used for this purpose?


Solution

I'm afraid there is no "one size fits all" solution to the problem you described.

It seems to me that you have a good understanding of what should be done, so let me suggest one other possible solution.

Assuming that you have a large number of users and each user has a small or moderate number of transactions (so that processing a single user's transactions doesn't take ages), you could do something like this in your South data migration. It uses the older transaction.commit_on_success() API because the question was asked before Django 1.6, which introduced transaction.atomic() as its replacement:

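from django.db import transaction

# Inside the South data migration's forwards(self, orm) method.
# Each user is committed separately, so the run never holds one long transaction.
for user in orm.User.objects.all():
    with transaction.commit_on_success():
        user._total = calculate_sum_of_transactions_for_user(user)
        user.transactions_migrated = True
        user.save()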

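Then you could add the following property to your User model:

@property
def total(self):
    # Use the stored tally once this user has been migrated;
    # until then, fall back to summing on demand.
    if self.transactions_migrated:
        return self._total
    else:
        return calculate_sum_of_transactions_for_user(self)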

And the transaction creation code could look like this:

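from django.db import models
from django.db.models import F

class Transaction(models.Model):
    amount = models.DecimalField(...)

    def save(self, *args, **kwargs):
        super(Transaction, self).save(*args, **kwargs)
        # Only maintain the stored tally once this user has been migrated.
        if self.user.transactions_migrated:
            # F() makes the database do the increment, avoiding a read-modify-write race.
            self.user._total = F('_total') + self.amount
            self.user.save()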
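
The reason F('_total') + self.amount is race-proof is that the addition is carried out by the database in a single UPDATE, not computed in Python from a value read earlier. As a rough illustration only, here is the same idea in raw SQL using Python's standard sqlite3 module. The table name app_user, the column total, and the helper add_to_total are hypothetical, and amounts are stored as integer cents to keep the sketch simple:

```python
import sqlite3

# In-memory database standing in for the real one (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_user (id INTEGER PRIMARY KEY, total INTEGER NOT NULL)")
conn.execute("INSERT INTO app_user (id, total) VALUES (1, 100)")


def add_to_total(connection, user_id, amount):
    # The increment happens inside the database, so two concurrent callers
    # cannot overwrite each other's update with a stale value.
    connection.execute(
        "UPDATE app_user SET total = total + ? WHERE id = ?",
        (amount, user_id),
    )
    connection.commit()


add_to_total(conn, 1, 25)
add_to_total(conn, 1, 5)
new_total = conn.execute("SELECT total FROM app_user WHERE id = 1").fetchone()[0]
```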

You could even get rid of the transactions_migrated field and replace it with a check for whether _total is None.
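
A minimal sketch of that variant, written with plain Python classes rather than Django models so that it runs on its own. UserTotals, transaction_amounts, migrate and add_transaction are hypothetical stand-ins, not names from the code above; in the real app _total would presumably be a nullable model field and the increment would be done with an F() expression:

```python
from decimal import Decimal


class UserTotals(object):
    """Hypothetical stand-in for the User model."""

    def __init__(self, transaction_amounts):
        self.transaction_amounts = list(transaction_amounts)
        self._total = None  # None means "not migrated yet"

    def calculate_sum_of_transactions(self):
        # Slow path: tally every transaction on demand.
        return sum(self.transaction_amounts, Decimal("0"))

    @property
    def total(self):
        # Fall back to the on-demand tally until the stored value exists.
        if self._total is None:
            return self.calculate_sum_of_transactions()
        return self._total

    def migrate(self):
        # What the data migration would do for a single user.
        self._total = self.calculate_sum_of_transactions()

    def add_transaction(self, amount):
        self.transaction_amounts.append(amount)
        # Maintain the stored tally only once it has been initialised.
        if self._total is not None:
            self._total += amount
```

Before migrate() runs, total is computed from the transactions; afterwards it is read from _total and kept current by add_transaction().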

Licensed under: CC-BY-SA with attribution