Question

We have a tagging system for users to filter their files by defined tags.

Here's how the models are set up:

class Tags(models.Model):
    name = models.CharField(max_length=100)
    user = models.ForeignKey(User)

class Files(models.Model):
    user = models.ForeignKey(User)
    name = models.CharField(max_length=100)
    tags = models.ManyToManyField(Tags, null=True, blank=True)

Now, because tags are not required, removing tags from a file doesn't delete them. This leaves a bunch of unused tags in our database, and we want to clean them up.

I've tried redefining the save method on the Files model, and the clean method.

I've tried connecting an m2m_changed signal on the Files model: https://docs.djangoproject.com/en/dev/ref/signals/#m2m-changed

Last thing I tried was a pre_save signal: https://docs.djangoproject.com/en/dev/ref/signals/#pre-save

I was planning to iterate over the tags and delete the ones with an empty files_set, but with these methods I can't reliably figure that out: m2m_changed fires several times with different actions, so I end up removing tags that aren't associated yet but are about to be.

Here's what I thought would work:

def handle_tags(sender, instance, *args, **kwargs):
    action = kwargs.get('action')
    if action == 'post_clear':
        # search through the user's tags... I guess?
        tags = Tags.objects.filter(user=instance.user)
        for tag in tags:
            if not tag.files_set.exists():
                tag.delete()
    return

m2m_changed.connect(handle_tags, sender=Files.tags.through)

But, as I said, it will delete a tag before it is added (and if it's added, we obviously don't want to delete it).


Solution

You were on the right track using the m2m_changed signal.

Your problem is that by the time the post_clear action fires, the file's tag relations have already been removed, so you can no longer find out which tags were attached to it.

You actually need to run your handler before the relations are removed, which means handling the pre_clear action.

Something like this:

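from django.db.models import Count
from django.db.models.signals import m2m_changed
from django.dispatch import receiver


@receiver(m2m_changed, sender=Files.tags.through)
def handle_tags(sender, **kwargs):
    action = kwargs['action']

    if action == "pre_clear":
        # pk_set is None for clear actions, so read the pks from the file.
        tags_pk_set = kwargs['instance'].tags.values_list('pk', flat=True)
    elif action == "pre_remove":
        tags_pk_set = kwargs.get('pk_set')
    else:
        return

    # Count() avoids iterating over the tag objects. The relations still
    # exist at this point, so a tag about to be orphaned has exactly one file.
    annotated_tags = Tags.objects.annotate(n_files=Count('files'))
    unreferenced = annotated_tags.filter(pk__in=tags_pk_set).filter(n_files=1)
    unreferenced.delete()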

I've also added handling for the pre_remove action, where you can use the pk_set argument to get the tags that are about to be removed.
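In case the n_files=1 filter looks odd: the handler runs before the relation rows are removed, so a tag that is about to be orphaned still counts the file it is being detached from. Here is a minimal plain-Python model of that check, with no Django involved. The tag_files dictionary and the tags_orphaned_by_removal function are hypothetical names used only for illustration.

```python
def tags_orphaned_by_removal(tag_files, file_id, pk_set):
    """Return the tag ids in pk_set that will have no files left
    once they are detached from file_id.

    tag_files maps tag id -> set of file ids currently using that tag,
    i.e. the state *before* the removal, as a pre_remove handler sees it.
    """
    orphaned = set()
    for tag_id in pk_set:
        files = tag_files.get(tag_id, set())
        # Exactly one file uses the tag, and it is the one being detached.
        if files == {file_id}:
            orphaned.add(tag_id)
    return orphaned


# Tag 1 is used by files 10 and 11, tag 2 only by file 10, tag 3 only by file 11.
tag_files = {1: {10, 11}, 2: {10}, 3: {11}}
result = tags_orphaned_by_removal(tag_files, file_id=10, pk_set={1, 2})
```

Note that this model also checks that the single remaining file is the one being detached, which is slightly stricter than the n_files=1 filter. That only matters if pk_set could include a tag that is not attached to the file.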

UPDATE

Of course the previous listener won't delete the unreferenced tags when the files themselves are deleted, since it only handles the pre_clear and pre_remove actions of the m2m_changed signal sent by Files.tags.through. In order to do what you want, you should also handle the pre_delete signal of the Files model.

In the code below I've added a utility function remove_tags_if_orphan, a slightly modified version of handle_tags, and a new handler called handle_file_deletion that removes the tags which will become unreferenced once the file is deleted.

from django.db.models import Count
from django.db.models.signals import m2m_changed, pre_delete
from django.dispatch import receiver


def remove_tags_if_orphan(tags_pk_set):
    """Removes tags in tags_pk_set if they're associated with only 1 File."""

    annotated_tags = Tags.objects.annotate(n_files=Count('files'))
    unreferenced = annotated_tags.filter(pk__in=tags_pk_set).filter(n_files=1)
    unreferenced.delete()


# This will clean unassociated Tags when clearing or removing Tags from a File
@receiver(m2m_changed, sender=Files.tags.through)
def handle_tags(sender, **kwargs):
    action = kwargs['action']
    if action == "pre_clear":
        tags_pk_set = kwargs['instance'].tags.values_list('pk', flat=True)
    elif action == "pre_remove":
        tags_pk_set = kwargs.get('pk_set')
    else:
        return
    remove_tags_if_orphan(tags_pk_set)


# This will clean unassociated Tags when deleting/bulk-deleting File objects
@receiver(pre_delete, sender=Files)
def handle_file_deletion(sender, **kwargs):
    associated_tags = kwargs['instance'].tags.values_list('pk', flat=True)
    remove_tags_if_orphan(associated_tags)

Hope this clears things up.

OTHER TIPS

Just to sum up with hopefully a cleaner answer:

from django.db.models.signals import m2m_changed
from django.dispatch import receiver

class Tags(models.Model):
    name = models.CharField(max_length=100)
    user = models.ForeignKey(User)

class Files(models.Model):
    user = models.ForeignKey(User)
    name = models.CharField(max_length=100)
    tags = models.ManyToManyField(Tags, null=True, blank=True)


@receiver(m2m_changed, sender=Files.tags.through)
def delete_orphaned_tags(sender, **kwargs):
    if kwargs['action'] == 'post_remove':
        # Lookups use the related query name ('files'), not the files_set accessor.
        Tags.objects.filter(pk__in=kwargs['pk_set'], files__isnull=True).delete()

post_remove ensures that the callback fires after a Tag has been disassociated from a File.
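To show how this differs from a pre_remove handler, here is a small plain-Python model of the check as it looks after the relation rows are gone: an orphaned tag now has zero files rather than one. It also shows the gap for clear(), since pk_set is None for the pre_clear and post_clear actions. The names orphaned_after_removal and tag_files_after are hypothetical, and nothing here touches Django.

```python
def orphaned_after_removal(tag_files, pk_set):
    """Return the tag ids in pk_set that have no files left.

    tag_files maps tag id -> set of file ids, describing the state
    *after* the relation rows were removed (what post_remove sees).
    """
    if pk_set is None:
        # Clear actions carry no pk_set, so there is nothing to check here.
        return set()
    return {tag_id for tag_id in pk_set if not tag_files.get(tag_id)}


# State after detaching tags 1 and 2 from file 10.
tag_files_after = {1: {11}, 2: set(), 3: {11}}
orphans = orphaned_after_removal(tag_files_after, pk_set={1, 2})
```

So a post_remove-only receiver is simpler, but tags dropped through clear() would need to be collected in pre_clear, while they can still be looked up.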

I think you're going deeper than required. Just define a related_name on the tags field and handle the post_save signal from Files.

from django.db.models.signals import post_save


class Files(models.Model):
    user = models.ForeignKey(User)
    name = models.CharField(max_length=100)
    tags = models.ManyToManyField(Tags, null=True, blank=True, related_name='files')


def clean_empty_tags(sender, instance, *args, **kwargs):
    # Delete this user's tags that are not attached to any file.
    Tags.objects.filter(user=instance.user, files=None).delete()

post_save.connect(clean_empty_tags, sender=Files)
Licensed under: CC-BY-SA with attribution