Вопрос

My system has products, with images associated with them, like so:

class Product(models.Model):
    name = models.CharField(max_length=100)
    ...

class Image(models.Model):
    product = models.ForeignKey(Product)
    image = models.ImageField(upload_to='products')

So far so good. Naturally, the client wants to upload their products in bulk in a csv and upload a zip file containing the images. I format the csv as so:

product_name,image_1.jpg,image_2.jpg,...
product_2,image.jpg,...

So far I've made a model just as a helper:

class BulkUpload(models.Model):
    csv = models.FileField(upload_to='tmp')
    img_zip = models.FileField(upload_to='tmp')

The workflow goes something like this:

  1. User uploads files via the django admin
  2. Get the zip file contents and store for later
  3. Extract the zip into tmp directory
  4. Start a transaction. If anything unexpected happens from here we rollback
  5. For each row in the csv
    1. Create and save product with the name specified in the first column.
    2. Grab image filenames from the other csv fields
    3. Check the images are in the zip, otherwise rollback
    4. Check the images don't already exist in the destination directory, otherwise rollback
    5. Move the images to the destination directory and set the fk to the saved product object, rollback on any errors.
  6. Commit the transaction
  7. Delete the zip and csv, and delete the bulk upload object (or just don't save it)

If we roll back at any point we should somehow inform the user what went wrong.

My initial idea was to override save or use a post_save signal, but not having access to the request means I can neither use messages nor raise a validation error. Overriding model_save() in the admin has it's own problems, not being able to do any validation.

So now my thought is to change the ModelForm and give this to the django admin. I can override the clean() method, raise ValidationErrors and (presumably) run all my stuff in a transaction. But I'm struggling to figure out how I can access the files in such a way that I can use Python's ZipFile and csv libraries on them. It also feels a little dirty to do actual work in a form validation method, but I'm not sure where else I can do it.

I might have gone into too much detail, but I wanted to explain the solution so that alternative solutions can be suggested.

Это было полезно?

Решение

I don't think you should use a BulkUpload or any model representing this operation, at least if you plan on doing the process synchronously as you're currently suggesting. I would add an additional view to the admin area, either by hand or using a third party library, and there I would process the form and perform the workflow.

But anyways, given you already have your BulkUpload model, it's certainly easier to do it using a admin.ModelAdmin object. Your main concern seems to be where you should place the code of the transaction. As you've mentioned there are several alternatives. In my opinion, the best option would be to divide the process in two parts:

First, in your model's clean method you should check all the potential errors that may be produced by the user: images that already exist, missing images, duplicated products, etcetera. Here you should check that the uploaded files are OK, for instance using something like:

def clean(self):
    if not zipfile.is_zipfile(self.img_zip.file):
        raise ValidationError('Not a zip file')

After that, you know that any error that may arise from this point on will be produced by a system error: the bd failing, the HD not having enough space, etc. because all other possible errors should have been checked in the previous step. In your ModelAdmin.save_model method you should perform the rest of your workflow. You can inform the user of any errors using ModelAdmin.message_user.

As for the actual processing of the uploaded files, well, you named it: just use the zipfile and csv modules in the standard library. You should create a ZipFile object and extract it somewhere. Now, you should go over the data of your csv file using csv.reader. Something like this (not tested):

def save_model(self, request, obj, form, change):
    # ...
    with open('tmp/' + obj.img_zip.name, 'r') as csvfile:
            productreader = csv.reader(csvfile)
            for product_details in productreader:
                p = Product(name=product_details[0])
                p.save()

                for image in product_details[1:]:
                    i = ImageField()
                    i.product = p
                    i.image = File(open('tmp/' + image)) # not tested
                    i.save() 

After all this there would be no point in having a BulkUpload instance, so you should delete it. That's why I said in the beginning that that model is a little bit useless.

Obviously you'd need to add the code for the transactions and some other stuff, but I hope you get the general idea.

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top