Question

I have a Django website with a PostgreSQL database. There is a Django app and model for a 'flat' item table with many records being inserted regularly, up to millions of inserts per month. I would like to use these records to automatically populate a star schema of fact and dimension tables (initially also modeled in the Django models.py), in order to efficiently do complex queries on the records, and present data from them on the Django site.

Two main options keep coming up:

1) PostgreSQL Triggers: Configure the database directly to insert the appropriate rows into the fact and dimension tables whenever a record is created or updated, possibly using PL/Python or PL/pgSQL and row-level AFTER triggers. Pros: Works with inputs from outside Django; may be more efficient. Cons: Splits business logic across a second location; other input sources may not expect their inserts to trigger further inserts.

2) Django Signals: Use the built-in signal django.db.models.signals.post_save to do the inserts when a record is created or updated. Pros: Easier to build and maintain. Cons: New input sources must either run inside the Django site/app environment or repeat some of the code.

Am I correct in thinking that Django's built-in signals are the way to go for maintaining the fact table and the dimension tables? Or is there some other, significant option that is being missed?

Solution

I ended up using Django signals. With a flat table "item_record" containing the fields "title" and "description", the code in models.py looks like this:

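from django.db.models.signals import post_save

# ItemRecord and ItemRecordHistory are assumed to be defined earlier in this models.py.

def create_item_record_history(sender, instance, created, **kwargs):
    # Copy each newly created ItemRecord into the history table; updates are skipped.
    if created:
        ItemRecordHistory.objects.create(
            title=instance.title,
            description=instance.description,
            created_at=instance.created_at,
        )

# Run the handler after every ItemRecord save.
post_save.connect(create_item_record_history, sender=ItemRecord)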

It is running well for my purposes. Although it's just creating an annotated flat table (new field "created_at"), the same method could be used to build out a star schema.

Licensed under: CC-BY-SA with attribution