Question

In the simple case of proxying a download from S3 to the client, I'm having trouble dealing with client disconnections mid-download.

val enumerator = Enumerator.outputStream{ out =>

    val s3Object = s3Client.getObject(new GetObjectRequest(bucket, path))
    val in = new BufferedInputStream(s3Object.getObjectContent())
    val bufferedOut = new BufferedOutputStream(out)

    Iterator.continually(in.read).takeWhile(_ != -1).foreach(bufferedOut.write)

    in.close()
    bufferedOut.close()
}

Ok.chunked(enumerator.andThen(Enumerator.eof)).withHeaders(
    "Content-Disposition" -> s"attachment; filename=${name}"
)

This works (mostly) beautifully as long as the client doesn't cancel the download before it completes. If the client does cancel, the enumerator keeps buffering data until the download from S3 finishes, so several cancelled downloads can tie up a lot of resources.

Are there any better patterns that can prevent this from happening?


Solution

Move closing the InputStream into an onDoneEnumerating block, and use Enumerator.fromStream instead of Enumerator.outputStream:

val s3Object = s3Client.getObject(new GetObjectRequest(bucket, path))
val in = s3Object.getObjectContent()

val enumerator = Enumerator.fromStream(in).onDoneEnumerating {
  in.close()
}

fromStream reads one chunk at a time (8 KiB by default), so wrapping the stream in a BufferedInputStream is unnecessary.

Also, with outputStream, there is no way for the Iteratee to push back if it is slow to consume the data, which could result in a large amount of data (up to the size of the S3 object) buffering in memory. This is dangerous, because the application could run out of memory. With fromStream, the next read won't occur until the Iteratee is ready for more data.
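To make the difference concrete, here is a minimal sketch of the pull-based idea using only the standard library. It is not Play's implementation of fromStream; ChunkReader and its reads counter are hypothetical names made up for this illustration. Each chunk is read only when the consumer asks for it, and the stream is closed either at end of input or when the consumer gives up early (for example, on a client disconnect).

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical helper, not part of Play: a pull-based chunk reader.
// At most one read happens per requested chunk, so nothing is
// buffered ahead of what the consumer has asked for.
final class ChunkReader(in: InputStream, chunkSize: Int = 8 * 1024)
    extends Iterator[Array[Byte]] with AutoCloseable {
  require(chunkSize > 0, "chunkSize must be positive")

  private val buffer = new Array[Byte](chunkSize)
  private var pending: Option[Array[Byte]] = None
  private var finished = false
  var reads = 0 // number of read() calls made; exposed for illustration only

  def hasNext: Boolean = {
    if (pending.isEmpty && !finished) {
      reads += 1
      val n = in.read(buffer)
      if (n == -1) close() // end of input: release the stream
      else pending = Some(java.util.Arrays.copyOf(buffer, n))
    }
    pending.isDefined
  }

  def next(): Array[Byte] = {
    if (!hasNext) throw new NoSuchElementException("stream exhausted")
    val chunk = pending.get
    pending = None
    chunk
  }

  // Idempotent: runs on end of input or when the consumer stops early.
  def close(): Unit = if (!finished) {
    finished = true
    pending = None
    in.close()
  }
}

object ChunkReaderDemo {
  def main(args: Array[String]): Unit = {
    val reader = new ChunkReader(new ByteArrayInputStream(new Array[Byte](20000)))
    val first = reader.next() // triggers exactly one read
    reader.close()            // simulate the client going away
    println(s"got ${first.length} bytes after ${reader.reads} read call(s)")
  }
}
```

Closing the reader after the first chunk leaves the rest of the source unread, which is the behaviour you want when a download is cancelled halfway through.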

Licensed under: CC-BY-SA with attribution