Question

I am hoping to use the Diff-Match-Patch algorithms available from Google, as part of the Google MobWrite real-time collaborative text editor protocol, in order to embed a real-time collaborative text editor in my program.

I was wondering what the most efficient way might be to store the "global" copy of each document that users are editing. I would like each document to be stored on a server that is not local to any user. Each time a user performs an "operation" (delete, insert, paste, cut), the diff between their copy and the server's copy would be computed and patched, and so on. If you know the Google MobWrite protocol, you probably understand what I am saying.

Should the server's text be stored as a file that is changed in place, inside an SQL database as a long string, or something else? Should I be using WebSockets to communicate with the server? I am honestly something of an amateur when it comes to this, but I am generally a fast learner. Does anyone have any tips or resources I could follow? Thanks a lot.

No correct solution

OTHER TIPS

This would be a big project to tackle from scratch, so I suggest you use one of the many open source projects in this area. For example, Etherpad:

https://code.google.com/p/etherpad/

MobWrite uses the Differential Synchronization technique, which is totally different from the Operational Transformation technique.

Differential Synchronization is supposed to have a communication cycle that always starts from the client (the browser), which means you can't use WebSockets to push diffs from the server directly. The browser needs to poll the server frequently for updates (say, every 2 seconds); otherwise your shadow copies will get out of sync.
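To make the client-initiated cycle concrete, here is a minimal in-process sketch in Python. It is only an illustration under stated assumptions: the names `Server`, `Client`, `make_edits`, `patch_exact` and `patch_fuzzy` are hypothetical, the standard-library `difflib` stands in for Diff-Match-Patch, and the "fuzzy" patch is a plain context search rather than the real library's fuzzy matching. A direct method call takes the place of the HTTP request the browser would make on each poll.

```python
from difflib import SequenceMatcher

CONTEXT = 8  # assumed: characters of leading context kept with each edit


def make_edits(old, new):
    """Diff old -> new as (start, old_chunk, new_chunk, context) tuples."""
    edits = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            context = old[max(0, i1 - CONTEXT):i1]
            edits.append((i1, old[i1:i2], new[j1:j2], context))
    return edits


def patch_exact(text, edits):
    """Apply edits by index. Only valid on the exact string the diff came from."""
    for start, old, new, _ in reversed(edits):  # reversed keeps indices valid
        text = text[:start] + new + text[start + len(old):]
    return text


def patch_fuzzy(text, edits):
    """Best-effort patch of a text that may have diverged (simplified stand-in)."""
    for _, old, new, context in reversed(edits):
        pos = text.find(context + old)
        if pos == -1:
            continue  # edit no longer applies, so it is dropped
        at = pos + len(context)
        text = text[:at] + new + text[at + len(old):]
    return text


class Server:
    def __init__(self, text):
        self.text = text      # the shared server copy
        self.shadows = {}     # client_id -> shadow copy for that client

    def join(self, client_id):
        self.shadows[client_id] = self.text
        return self.text

    def sync(self, client_id, client_edits):
        # Patch the shadow exactly and the server copy best-effort.
        shadow = patch_exact(self.shadows[client_id], client_edits)
        self.text = patch_fuzzy(self.text, client_edits)
        # Reply with whatever the client is still missing.
        reply = make_edits(shadow, self.text)
        self.shadows[client_id] = self.text
        return reply


class Client:
    def __init__(self, client_id, server):
        self.client_id = client_id
        self.server = server
        self.text = server.join(client_id)
        self.shadow = self.text

    def sync(self):
        """One polling round trip, always started by the client."""
        edits = make_edits(self.shadow, self.text)
        self.shadow = self.text
        reply = self.server.sync(self.client_id, edits)
        self.text = patch_fuzzy(self.text, reply)
        self.shadow = patch_exact(self.shadow, reply)


server = Server("The cat sat.")
alice, bob = Client("alice", server), Client("bob", server)
alice.text = "The black cat sat."
bob.text = "The cat sat down."
alice.sync()
bob.sync()
alice.sync()  # Alice only sees Bob's edit on her next poll
print(server.text, "|", alice.text, "|", bob.text)
```

In this sketch Alice does not see Bob's change until she polls again, which is the behaviour the paragraph above describes. A real deployment would also need the version numbers and checksums that MobWrite uses to recover from lost packets, which are left out here.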

To store your shadow copies while the user is active, you can use whatever you want, but it is better to use an in-memory DB (Redis), since you need fast access to do the diffs and patches. When the user leaves the session, you don't need their copy anymore. But if you need persistence in your app, you should persist only the server copy, not the shadow copies (shadow copies are only used to find the diffs); for that you can use MySQL or whatever you like.
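As a rough sketch of that split, the Python snippet below keeps shadow copies in a plain dictionary and persists only the server copy. The assumptions are loud ones: the `DocumentStore` class and its method names are hypothetical, the dictionary stands in for Redis, and the standard-library `sqlite3` stands in for MySQL so that the example runs without any external service.

```python
import sqlite3


class DocumentStore:
    """Hypothetical store: shadows live in memory, server copies on disk."""

    def __init__(self, db_path=":memory:"):
        # sqlite3 is a stand-in for MySQL or any other persistent store.
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS documents ("
            "doc_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        # A dict is a stand-in for Redis: (doc_id, user_id) -> shadow text.
        self.shadows = {}

    def load(self, doc_id):
        """Return the persisted server copy, or an empty document."""
        row = self.db.execute(
            "SELECT body FROM documents WHERE doc_id = ?", (doc_id,)
        ).fetchone()
        return row[0] if row else ""

    def save(self, doc_id, body):
        """Persist the server copy only; shadows are never written here."""
        self.db.execute(
            "INSERT OR REPLACE INTO documents (doc_id, body) VALUES (?, ?)",
            (doc_id, body),
        )
        self.db.commit()

    def open_session(self, doc_id, user_id):
        """Create a shadow copy for a user who starts editing."""
        body = self.load(doc_id)
        self.shadows[(doc_id, user_id)] = body
        return body

    def close_session(self, doc_id, user_id):
        """Discard the shadow copy when the user leaves."""
        self.shadows.pop((doc_id, user_id), None)


store = DocumentStore()
store.save("notes", "The cat sat.")
store.open_session("notes", "alice")
store.close_session("notes", "alice")
```

Storing the whole document as one text column keeps reads and writes simple. Whether to save on every sync or on a timer is a separate trade-off that this sketch does not address.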

For the Operational Transformation technique, however, there are some nice libraries out there:

NodeJS:

  • ShareJS (sharejs.org): supports all operations for JSON.
  • RacerJS: a synchronization model built on top of ShareJS.
  • DerbyJS: a complete framework that uses RacerJS as its model.

OpenCoweb (opencoweb.org): the server is either Java or Python; the client is built with Dojo.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow