Django: allowing download of various media files from S3 without creating a model (while hiding s3 storage)

https://stackoverflow.com/questions/22960374

30-06-2023

Question

I have thousands of media files in S3.

The files can be of any MIME type: plain text, HTML, XML, PDF, binary, zip, etc.
In addition, some files may be gzipped.

I would like to render these files in a Django app. I don't want to give users direct access to S3. In some cases, I want to modify the file before rendering it.
For example:

/base/path/file_name_aaa.txt.gz <--- download from S3, unzip, and render as preformatted text through Django
/base/path/file_name_aaa.pdf <--- download from S3 and render as PDF through Django
/base/path/file_name_bbb.pdf.gz <--- download from S3, unzip, and render as PDF through Django
/base/path/file_name_ccc.xml.gz <--- download from S3, unzip, replace some content, and render as unzipped XML through Django

I have the first case (gzipped plain text) working:

import zlib

from boto.s3.connection import S3Connection


def stream_decompress(stream):
    '''
    Decompress a gzipped S3 stream chunk by chunk.
    http://stackoverflow.com/questions/12571913/python-unzipping-stream-of-bytes
    '''
    dec = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS selects the gzip format
    for chunk in stream:
        rv = dec.decompress(chunk)
        if rv:
            yield rv
    # Emit whatever is still buffered once the stream ends.
    tail = dec.flush()
    if tail:
        yield tail


def get_gzipped_content(stream):
    # The chunks are bytes, so join them once instead of concatenating in a loop.
    return b''.join(stream_decompress(stream))


# aws_key and aws_secret are assumed to be defined elsewhere.
conn = S3Connection(aws_key, aws_secret)
fname = 'aaa/bbb/ccc_1234.txt.gz'
key = conn.get_bucket('my_bucket').get_key(fname)
if fname.lower().endswith('.gz'):
    content = get_gzipped_content(key)
else:
    content = key.get_contents_as_string()
# (render content as a string in Django)

I would appreciate help handling the other MIME types and their gzipped variants.

Solution 2

You could use the standard mimetypes module to determine the content type and encoding from the filename, e.g.:

In [1]: import mimetypes

In [2]: mimetypes.guess_type('hello.txt.gz')
Out[2]: ('text/plain', 'gzip')

In [3]: mimetypes.guess_type('hello.pdf.gz')
Out[3]: ('application/pdf', 'gzip')

In [4]: mimetypes.guess_type('hello.pdf')
Out[4]: ('application/pdf', None)

https://docs.python.org/2/library/mimetypes.html
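
One way to tie this to the cases in the question is sketched below. It assumes the whole object has already been read from S3 into memory as bytes. `prepare_payload` and its `transform` parameter are hypothetical names used for illustration, not part of any library.

```python
import gzip
import mimetypes


def prepare_payload(filename, raw, transform=None):
    """Return (content_type, body) for bytes fetched from storage.

    Hypothetical helper: `raw` is assumed to be the full object as bytes.
    """
    content_type, encoding = mimetypes.guess_type(filename)
    body = raw
    if encoding == 'gzip':
        # guess_type reports the .gz suffix as the encoding, not the type.
        body = gzip.decompress(raw)
    if transform is not None:
        # Optional hook, e.g. to replace some content in an XML file.
        body = transform(body)
    # Fall back to a generic binary type when the extension is unknown.
    return content_type or 'application/octet-stream', body
```

For the XML case, something like `prepare_payload('file_name_ccc.xml.gz', data, transform=lambda b: b.replace(b'old', b'new'))` would unzip the data and then apply the replacement. The returned content type and body can then be handed to the response.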

OTHER TIPS

In addition to what kubus added, I was also trying to figure out how to control whether the browser renders the file or downloads it.

import mimetypes

from django.http import HttpResponse

response = HttpResponse(content, content_type=mimetypes.guess_type(attach_id)[0])
if force_download:  # True when the file should be downloaded, not rendered in the browser
    response['Content-Disposition'] = 'attachment; filename="%s"' % filename
# Otherwise, the browser will try to render it.
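
File names with spaces, quotes, or non-ASCII characters can break a hand-built header value. Below is a minimal sketch of a header builder, assuming the `filename*=UTF-8''...` form is acceptable for non-ASCII names. `content_disposition` is a hypothetical helper, not a Django function.

```python
from urllib.parse import quote


def content_disposition(filename, as_attachment):
    """Build a Content-Disposition header value (hypothetical helper)."""
    disposition = 'attachment' if as_attachment else 'inline'
    try:
        filename.encode('ascii')
    except UnicodeEncodeError:
        # Non-ASCII names use the percent-encoded filename* form.
        return "%s; filename*=UTF-8''%s" % (disposition, quote(filename))
    # Escape backslashes and double quotes inside the quoted string.
    safe = filename.replace('\\', '\\\\').replace('"', '\\"')
    return '%s; filename="%s"' % (disposition, safe)
```

The result can be assigned to `response['Content-Disposition']`. Passing `as_attachment=False` yields an `inline` disposition, which asks the browser to render the file if it can.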

Licensed under: CC-BY-SA with attribution
