Question

So this is the setup I'm working with:

  1. I am on an express server which must stream an archived binary payload to a browser (does not matter if it is zip, tar or tar.gz - although zip would be nice).
  2. On this server, I have a websocket open that connects to another server which is sending me binary payloads of individual files in a directory. I get these payloads streamed, piece-by-piece, as buffers, and I'm doing this serially (that is - file-by-file - there aren't multiple websockets open at one time, and there is one websocket per file). This is the websocket library I'm using: https://github.com/einaros/ws

I would like to go through each file, open a websocket, and then append the buffers to an archiver as they come through the websockets. When data is appended to the archiver, it would be nice if I could stream the ouput of the archiver to the browser (via the response object with response.write). So, basically, as I'm getting the payload from the websocket, I would like that payload streamed through an archiver and then to the response. :-)

Some things I have looked into:

  1. node-zipstream - This is nice because it gives me an output stream I can pipe directly to response.write. However, it doesn't appear to support nested files/folders, and, more importantly, it only accepts an input stream. I have looked at the source code (which is quite terse and readable), and it seems as though, if I were able to have access to the update function within ZipStream.prototype.addFile, I could just call that each time on the message event when I get a binary buffer from the websocket. This is quite messy/hacky though, and, given that this library already doesn't seem to support nested files/folders, I'm not sure I will be going with it.
  2. node-archiver - This suffers from the same issue as node-zipstream (probably because it was inspired by it) where it allows me to pipe the output, but I cannot append multiple buffers for the same file within the archive (and then manually signal when the last buffer has been appended for a given file). However, it does allow me to have nested folders, which is a clear win over node-zipstream.

Is there something I'm not aware of, or is this just a really crazy thing that I want to do?

The only alternative I see at this point is to wait for the entire payload to be streamed through a websocket and then append with node-archiver, but I really would like to reap the benefit of true streaming/archiving on-the-fly.

I've also thought about the possibility of creating a read stream of sorts just to serve as a proxy object that I can pass into node-archiver and then just append the buffers I get from the websocket to this read stream. Looking at various read streams, I'm not sure how to do this though. The only way I could think of was creating a writestream, piping buffers to it, and having a readstream read from that writestream. Am I on the correct thought process here?

As always, thanks for any help/direction you can offer SO community.

EDIT:

Since I just opened this question, and I'm new to node, there may be a better answer than the one I provided. I will keep this question open and accept a better answer if one presents itself within a few days. As always, I will upvote any other answers, even if they're ridiculous, as long as they're correct and allow me to stream on-the-fly as mine does.

Was it helpful?

Solution

I figured out a way to get this working with node-archiver. :-)

It was based off my hunch of creating a temporary "proxy stream" of sorts, inspired by this SO question: How to create streams from string in Node.Js?

The basic gist is (coffeescript syntax):

archive = archiver 'zip'
archive.pipe response // where response is the http response

// and then for each file...
fileName = ... // known file name
fileSize = ... // known file size
ws = .... // create websocket
proxyStream = new Stream()
numBytesStreamed = 0

archive.append proxyStream, name: fileName

ws.on 'message', (dataBuffer) ->
    numBytesStreamed += dataBuffer.length
    proxyStream.emit 'data', dataBuffer

    if numBytesStreamed is fileSize
        proxyStream.emit 'end'
        // function/indicator to do this for the next file in the folder

// and then when you're completely done...
archive.finalize (err, bytesOfArchive) ->
    if err?
        // do whatever
    else
        // unless you somehow knew this ahead of time
        res.addTrailers
            'Content-Length': bytesOfArchive
        res.end()

Note that this is not the complete solution I implemented. There is still a lot of logic dealing with getting the files, their paths, etc. Not to mention error-handling.

EDIT:

Since I just opened this question, and I'm new to node, there may be a better answer. I will keep this question open and accept a better answer if one presents itself within a few days. As always, I will upvote any other answers, even if they're ridiculous, as long as they're correct and allow me to stream on-the-fly as mine does.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top