Question

My goal is to limit a FileField on a Django ModelForm to PDFs and Word Documents. The answers I have googled all deal with creating a separate file handler, but I am not sure how to do so in the context of a ModelForm. Is there a setting in settings.py I may use to limit upload file types?

Was it helpful?

Solution

I handle this by using a clean_[your_field] method on a ModelForm. You could set a list of acceptable file extensions in settings.py to check against in your clean method, but there's nothing built-into settings.py to limit upload types.

Django-Filebrowser, for example, takes the approach of creating a list of acceptable file extensions in settings.py.

Hope that helps you out.

OTHER TIPS

Create a validation method like:

def validate_file_extension(value):
    if not value.name.endswith('.pdf'):
        raise ValidationError(u'Error message')

and include it on the FileField validators like this:

actual_file = models.FileField(upload_to='uploaded_files', validators=[validate_file_extension])

Also, instead of manually setting which extensions your model allows, you should create a list on your setting.py and iterate over it.

Edit

To filter for multiple files:

def validate_file_extension(value):
  import os
  ext = os.path.splitext(value.name)[1]
  valid_extensions = ['.pdf','.doc','.docx']
  if not ext in valid_extensions:
    raise ValidationError(u'File not supported!')

Validating with the extension of a file name is not a consistent way. For example I can rename a picture.jpg into a picture.pdf and the validation won't raise an error.

A better approach is to check the content_type of a file.

Validation Method

def validate_file_extension(value):
    if value.file.content_type != 'application/pdf':
        raise ValidationError(u'Error message')

Usage

actual_file = models.FileField(upload_to='uploaded_files', validators=[validate_file_extension])

An easier way of doing it is as below in your Form

file = forms.FileField(widget=forms.FileInput(attrs={'accept':'application/pdf'}))

For a more generic use, I wrote a small class ExtensionValidator that extends Django's built-in RegexValidator. It accepts single or multiple extensions, as well as an optional custom error message.

class ExtensionValidator(RegexValidator):
    def __init__(self, extensions, message=None):
        if not hasattr(extensions, '__iter__'):
            extensions = [extensions]
        regex = '\.(%s)$' % '|'.join(extensions)
        if message is None:
            message = 'File type not supported. Accepted types are: %s.' % ', '.join(extensions)
        super(ExtensionValidator, self).__init__(regex, message)

    def __call__(self, value):
        super(ExtensionValidator, self).__call__(value.name)

Now you can define a validator inline with the field, e.g.:

my_file = models.FileField('My file', validators=[ExtensionValidator(['pdf', 'doc', 'docx'])])

I use something along these lines (note, "pip install filemagic" is required for this...):

import magic
def validate_mime_type(value):
    supported_types=['application/pdf',]
    with magic.Magic(flags=magic.MAGIC_MIME_TYPE) as m:
        mime_type=m.id_buffer(value.file.read(1024))
        value.file.seek(0)
    if mime_type not in supported_types:
        raise ValidationError(u'Unsupported file type.')

You could probably also incorporate the previous examples into this - for example also check the extension/uploaded type (which might be faster as a primary check than magic.) This still isn't foolproof - but it's better, since it relies more on data in the file, rather than browser provided headers.

Note: This is a validator function that you'd want to add to the list of validators for the FileField model.

Django since 1.11 has a FileExtensionValidator for this purpose:

class SomeDocument(Model):
    document = models.FileFiled(validators=[
        FileExtensionValidator(allowed_extensions=['pdf', 'doc'])])

As @savp mentioned, you will also want to customize the widget so that users can't select inappropriate files in the first place:

class SomeDocumentForm(ModelForm):
    class Meta:
        model = SomeDocument
        widgets = {'document': FileInput(attrs={'accept': 'application/pdf,application/msword'})}
        fields = '__all__'

You may need to fiddle with accept to figure out exactly what MIME types are needed for your purposes.

As others have mentioned, none of this will prevent someone from renaming badstuff.exe to innocent.pdf and uploading it through your form—you will still need to handle the uploaded file safely. Something like the python-magic library can help you determine the actual file type once you have the contents.

I find that the best way to check the type of a file by checking the content type. I will also add that one the best place to do the type checking is in form validation. I would have a form and a validation as follows:

class UploadFileForm(forms.Form):
    file = forms.FileField()

    def clean_file(self):
        data = self.cleaned_data['file']

        # check if the content type is what we expect
        content_type = data.content_type
        if content_type == 'application/pdf':
            return data
        else:
            raise ValidationError(_('Invalid content type'))

The following documentation links can be helpful: https://docs.djangoproject.com/en/3.1/ref/files/uploads/ and https://docs.djangoproject.com/en/3.1/ref/forms/validation/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top