Question

Is there a way to maintain/work with a persistent connection for POST requests in Rails?

I'd like to create an API where my app accepts what amounts to a stream of data from an external service (I'm writing this external service, so I can be flexible in my design here). Speed is critical: I need to get the information from the external source at a rate of 1000+ points per second. Talking with some fellow computer scientists, one came up with the idea of using a persistent connection so that the expensive TCP handshake would only have to be performed once. Using a library within the external service, I would then send multiple POST requests over that connection to my Rails app and process them one by one.

My understanding of the Rails paradigm is that each request (POST, GET, PUT, etc.) takes one TCP connection. Is there a way I could utilize one TCP connection for multiple POSTs?

I'm currently using the following:

  • Rails 3.2
  • Ruby 1.9.3 (Could switch to 2.0 if necessary)

EDIT

To help clarify what my goal is:

I have an external system that collects 1,000 data points a second (3 floating-point numbers, a timestamp, and 2 integers). I'd like to push that data to my Ruby on Rails server. I'm hoping that, with a properly configured system, I can just use the HTTP stack in real time (as a data point is collected, I push it to my Rails server). I could also slow this rate of transmission down and group data points together before sending them. I've looked at using message queues, but I'd like to see if I can write a more "standard" HTTP API before going to a specialized queue API.

Was it helpful?

Solution

I think the Net::HTTP::Persistent library is what you are looking for. There's also this library, which goes one step further by implementing connection pools on top of persistent connections. But since it sounds like you just have one API endpoint, that might be overkill.
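The gist of persistent connections can be sketched with nothing but the Ruby stdlib. The following throwaway in-process server counts incoming TCP connections to show that several POSTs reuse one socket; the endpoint path and payload shape are made up, and Net::HTTP::Persistent adds reconnection handling and pooling on top of the same keep-alive mechanism:

```ruby
require 'net/http'
require 'socket'
require 'json'

# A minimal in-process HTTP server (stdlib only) that counts incoming TCP
# connections, so we can check that several POSTs share a single connection.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
connections = 0

Thread.new do
  loop do
    client = server.accept
    connections += 1
    loop do                          # serve keep-alive requests on this socket
      break if client.gets.nil?      # request line; nil once the client hangs up
      headers = {}
      while (line = client.gets) && line != "\r\n"
        key, value = line.chomp.split(': ', 2)
        headers[key.downcase] = value
      end
      client.read(headers['content-length'].to_i)  # consume the request body
      client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    end
    client.close
  end
end

points = 3.times.map { |i| { seq: i, value: i * 1.5 } }

# All requests inside one Net::HTTP.start block reuse the same TCP connection
# via HTTP/1.1 keep-alive, so the handshake happens only once.
Net::HTTP.start('127.0.0.1', port) do |http|
  points.each do |point|
    req = Net::HTTP::Post.new('/api/endpoint', 'Content-Type' => 'application/json')
    req.body = point.to_json
    http.request(req)
  end
end

puts connections  # 1 if keep-alive worked: all POSTs shared one connection
```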

Some additional thoughts: if you are really after raw speed, it might be worth sending a single multipart POST request to further reduce the overhead. This would come down to implementing a reverse server push.

For this to work, your Rails app would need to accept a chunk-encoded request. This is important, as we are continuously streaming data to the server without any knowledge of how long the resulting message body will ultimately be. HTTP/1.1 requires all messages (that is, responses and requests) to either be chunk-encoded or have their body size specified by a Content-Length header (cf. RFC 2616, section 4.4). However, most clients prefer the latter option, which results in some web servers not handling chunk-encoded requests well (e.g. nginx didn't implement this before v1.3.9).

As a serialization format, I can safely recommend JSON, which is fast to generate and widely accepted. An implementation for RoR can be found here. You might want to have a look at this implementation as well, as it works natively with streams and might thus be better suited. If you find that JSON doesn't suit your needs, give MessagePack a try.
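For illustration, serializing one data point of the shape described in the question (the field names here are made up) is a one-liner with the stdlib JSON module; appending a newline per point gives a simple stream-friendly framing:

```ruby
require 'json'

# One data point as described in the question: 3 floats, a timestamp,
# 2 integers. Field names are hypothetical.
point = { values: [1.5, 2.5, 3.5], ts: 1370000000.25, seq: 42, sensor: 7 }

line = JSON.generate(point)   # compact output, no pretty-printing
stream_frame = line + "\n"    # newline-delimited JSON: one point per line
puts stream_frame
```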

If you hit network saturation, it could be worth investigating request compression.

Everything put together, your request could look like this (compression and chunk-encoding stripped for the sake of legibility):

POST /api/endpoint HTTP/1.1
Host: example.com
Content-Type: multipart/mixed; boundary="-boundary-"
Transfer-Encoding: chunked
Content-Encoding: deflate

---boundary-
Content-Type: application/json

{...}
---boundary-
Content-Type: application/json

{...}
---boundary---

The MIME type is multipart/mixed as I felt it was the most appropriate one. It actually implies that the message parts are of different content types, but as far as I can see this is nowhere enforced, so multipart/mixed is safe to use here. deflate is chosen over gzip as the compression method because it doesn't need to generate a CRC32 checksum. This allows for a speed boost (and saves a few bytes).
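To make the deflate-vs-gzip point concrete, here is a sketch with Ruby's stdlib Zlib (the payload is made up; note that Zlib::Deflate.deflate emits the zlib-wrapped stream that Content-Encoding: deflate formally refers to):

```ruby
require 'zlib'
require 'stringio'

payload = '{"ts":1370000000.25,"values":[1.5,2.5,3.5]}' * 50

# "deflate": zlib wrapper, 2-byte header + 4-byte Adler-32 trailer.
deflated = Zlib::Deflate.deflate(payload)

# gzip: 10-byte header + 8-byte trailer that includes the CRC32 checksum.
io = StringIO.new
gz = Zlib::GzipWriter.new(io)
gz.write(payload)
gz.close
gzipped = io.string

# gzip carries more framing overhead than deflate for the same compressed data.
puts [payload.bytesize, deflated.bytesize, gzipped.bytesize].inspect
```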

OTHER TIPS

I know you want an HTTP solution, but honestly, if speed is critical, I would take HTTP out of the equation. WebSockets seem much better suited to this problem.

See an example app from Heroku: https://devcenter.heroku.com/articles/ruby-websockets

And in general see Twitter stream API for an inspiration: https://dev.twitter.com/docs/streaming-apis

On top of that, you could transfer binary data instead of text to speed up the transfer further, and then have workers that ingest and save the data.
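As a sketch of the binary idea, one data point from the question (3 floats, a timestamp, 2 integers; the field layout here is made up) packs into a fixed 28-byte record with Array#pack, versus roughly 60-80 bytes as JSON:

```ruby
# Pack one data point into a fixed-size little-endian binary record.
# Layout (hypothetical): 3 x 32-bit float, 1 x 64-bit double timestamp,
# 2 x 32-bit signed integer = 12 + 8 + 8 = 28 bytes.
point  = [1.5, 2.5, 3.5, 1370000000.25, 42, 7]
record = point.pack('e3El<2')   # e = LE float, E = LE double, l< = LE int32

puts record.bytesize            # 28 bytes per point

# The receiver decodes with the same format string.
decoded = record.unpack('e3El<2')
```

The fixed record size also makes it trivial for a worker to split a received buffer back into individual points.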

Just my 2 cents.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow