Why does Google Wave Operational Transform need annotations?

https://stackoverflow.com/questions/4085847

google-wave

28-09-2019

Question

The operational transform (OT) machinery used in Google Wave has a rather curious document format. A document is basically a subset of XML: characters, start tags and end tags. In addition, the document has "annotations", which are metadata associated with ranges, i.e. a start position and an end position. The white paper justifies their presence with:

Wave document operations also support annotations. An annotation is some meta-data associated with an item range, i.e., a start position and an end position. This is particularly useful for describing text formatting and spelling suggestions, as it does not unnecessarily complicate the underlying structured document format.

I can certainly see how it would be awkward if an arbitrary range of a document were selected and, for example, bolded: XML tag nesting is strict, so that would cause a mess of open and close tag insertions.
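To make the nesting problem concrete, here is a minimal Python sketch. It is not Wave's actual data model: the token list, `bold_with_tags` and `bold_with_annotation` are hypothetical names invented for illustration. It bolds a range that crosses an existing `<i>` boundary, once by inserting tags and once by recording an annotation.

```python
def bold_with_tags(tokens, start, end):
    """Wrap characters [start, end) in <b>, keeping the tags well nested.

    `tokens` is a list in which a multi-character string is a tag and a
    single-character string is text (an assumption of this sketch).
    """
    out, pos, open_b = [], 0, False
    for tok in tokens:
        if len(tok) > 1:
            # Crossing an existing tag: close <b> first so nesting stays valid.
            if open_b:
                out.append("</b>")
                open_b = False
            out.append(tok)
            continue
        inside = start <= pos < end
        if inside and not open_b:
            out.append("<b>")
            open_b = True
        elif not inside and open_b:
            out.append("</b>")
            open_b = False
        out.append(tok)
        pos += 1
    if open_b:
        out.append("</b>")
    return out


def bold_with_annotation(annotations, start, end):
    """The annotation equivalent: record one range, leave the tokens alone."""
    return annotations + [(start, end, "bold")]


# "ab<i>cd</i>ef", with "bc" (characters 1..2) to be bolded.
tokens = ["a", "b", "<i>", "c", "d", "</i>", "e", "f"]
tagged = "".join(bold_with_tags(tokens, 1, 3))
annotated = bold_with_annotation([], 1, 3)
```

Here `tagged` comes out as `a<b>b</b><i><b>c</b>d</i>ef`: one user action became four tag insertions, while the annotation version adds a single tuple and does not touch the document structure.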

However, is this really a problem in practice? Does one necessarily have to support such an operation, unless the goal is an editor that mimics the years-old word-processing paradigm rather than a structured editor? Would pure XML operational transform, with the document structure being simply HTML5, be that terrible? Is it a performance issue that styles would be in the document as tags? Or does the operational transform model somehow produce unsatisfactory results for text formatting when it is represented with tags?

Also, a side question: how well would the pure "insert character, remove character, retain" operational transform model work on plain-text representations? For example, editing HTML5 as text, or editing Wikipedia articles?

Solution

There are fundamental problems with using a hierarchical markup language with OT. The following question has a worked example:

Does operational transformation work on structured documents such as HTML if simply treated as plain text?
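As a rough illustration of the kind of problem meant here, the Python sketch below merges two concurrent edits to markup that is treated as plain text. It is a deliberately simplified stand-in for insert-only transformation, not Wave's algorithm: every insert is expressed against the original text, and `merge_inserts` and `is_well_nested` are hypothetical helpers. Each edit is well formed on its own, but the merged result is not.

```python
import re


def merge_inserts(doc, *edits):
    """Apply concurrent insert-only edits, all expressed against `doc`."""
    inserts = [ins for edit in edits for ins in edit]
    # Applying from the highest position down means no insert shifts a
    # position that is still waiting to be applied.
    for pos, text in sorted(inserts, key=lambda ins: ins[0], reverse=True):
        doc = doc[:pos] + text + doc[pos:]
    return doc


def is_well_nested(markup):
    """Check that every closing tag matches the most recently opened one."""
    stack = []
    for closing, name in re.findall(r"<(/?)(\w+)>", markup):
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            return False
    return not stack


doc = "abcd"
alice = [(0, "<b>"), (3, "</b>")]  # bold "abc"
bob = [(1, "<i>"), (4, "</i>")]    # italicise "bcd"

alone_a = merge_inserts(doc, alice)      # well formed on its own
alone_b = merge_inserts(doc, bob)        # well formed on its own
merged = merge_inserts(doc, alice, bob)  # both edits combined
```

Both users converge on the same text, `<b>a<i>bc</b>d</i>`, so the transform has done its job at the character level, yet `is_well_nested(merged)` is false. Character-level convergence does not guarantee structurally valid markup.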

OTHER TIPS

This choice makes sense to me as an optimization on several fronts:

The underlying document remains as human-readable and parseable as possible.
The algorithms to parse the underlying XML remain as simple as possible (useful for compatibility with non-Google attempts at parsing the resulting documents, and for maintenance).
With styles stored as tags, the garbage that accumulates after multiple edits can lead to large performance hits, due to the sheer number of tags and/or the additional passes over the document needed to simplify it.
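For comparison, tidying up formatting that is stored as ranges can be a single pass over a list of intervals, with no document parsing involved. The sketch below assumes a hypothetical representation in which each bold annotation is a half-open `(start, end)` pair; it is not Wave's actual storage format.

```python
def merge_ranges(ranges):
    """Coalesce overlapping or touching half-open (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Four separate "bold" edits accumulated over time.
bold_ranges = [(0, 3), (3, 5), (10, 12), (4, 8)]
tidy = merge_ranges(bold_ranges)
```

Here `tidy` is `[(0, 8), (10, 12)]`. The equivalent cleanup on tags would mean finding and collapsing adjacent or nested `<b>` elements inside the document tree itself.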

Licensed under: CC-BY-SA with attribution
