Question

I'm reading the paper for Google's Spanner DB. This appears to address some similar problems to Rich Hickey's Datomic.

Does Google's Spanner DB implement a concept of Epochal Time?

Solution

Summary: I think so, but I'm not really sure what the "Epochal Time" concept actually is.


I watched the whole video referenced in the question (fortunately, it's an interesting video) without ever seeing a definition of the concept of "Epochal Time" (or what an "epoch" might be) other than that Hickey views the future as a pure function of the past (at least, in terms of databases)[1], or more precisely, as the composition of a series of transaction functions.

Reading between the lines, I believe that the core idea is that time can be divided into quanta, each of which represents the complete execution of a single transaction function. (I guess that Hickey would say that these quanta are "epochs", but maybe I'm wrong; the definition of "epoch" in the Datomic glossary is only somewhat related.) Since each transaction function effectively includes the assertion of its own execution, the transaction identifier can be considered a proxy for a time quantum; indeed, transaction identifiers used with a single transactor might be forced to be a monotonically increasing sequence of numbers coincidentally related to some machine's clock (although not precisely equal to it, since individual clocks can, from time to time, skip backwards).
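To make that reading concrete, here is a minimal Python sketch of "the future is a pure function of the past". It is my own illustration, not Datomic's API: `assert_fact` and `apply_all` are hypothetical names, and a database value is modelled as a read-only mapping. Each transaction function takes one database value and returns the next, and the position in the resulting history plays the role of the time quantum.

```python
from types import MappingProxyType


def assert_fact(key, value):
    """Return a hypothetical transaction function that asserts one fact."""
    def tx(db):
        # Build a new read-only value; the previous value is left untouched.
        return MappingProxyType({**db, key: value})
    return tx


def apply_all(initial, transactions):
    """Apply the transaction functions in order, keeping every intermediate value."""
    history = [initial]
    for tx in transactions:
        history.append(tx(history[-1]))
    return history


empty = MappingProxyType({})
history = apply_all(
    empty,
    [assert_fact("a", 1), assert_fact("b", 2), assert_fact("a", 3)],
)

# history[n] is the database value after n transactions.
print(dict(history[1]))  # state after the first transaction
print(dict(history[3]))  # state after all three
```

Because no value is ever modified in place, `history[1]` still answers "what did the database look like after the first transaction?" after later transactions have run.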

So I'm interpreting the idea as two-fold:

  1. Attach a "timestamp" to every mutation; and

  2. Arrange for mutations to add to, rather than substitute, past data.

If so, then both Bigtable -- with some implementation limitations -- and Spanner do conform to Hickey's model.
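The two points above can be sketched as an append-only log. This is only an illustration under my own assumptions: `FactLog`, `transact` and `as_of` are made-up names, and a simple counter stands in for the transactor's timestamp.

```python
import itertools


class FactLog:
    """Append-only store: every mutation is stamped and nothing is overwritten."""

    def __init__(self):
        self._facts = []                     # (tx, key, value), in tx order
        self._next_tx = itertools.count(1)   # monotonically increasing "timestamp"

    def transact(self, changes):
        """Append all changes under one new transaction id and return that id."""
        tx = next(self._next_tx)
        for key, value in changes.items():
            self._facts.append((tx, key, value))
        return tx

    def as_of(self, tx):
        """Rebuild the view as it stood after transaction `tx`."""
        view = {}
        for fact_tx, key, value in self._facts:
            if fact_tx > tx:
                break  # the log is in tx order, so later facts can be skipped
            view[key] = value
        return view


log = FactLog()
t1 = log.transact({"name": "Ada"})
t2 = log.transact({"name": "Grace", "city": "NYC"})

print(log.as_of(t1))  # the past is still available
print(log.as_of(t2))  # the present
```

The later write to `name` does not destroy the earlier one; it only changes which fact is the most recent when the log is read as of `t2`.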


Bigtable[2] provides a key-timestamp-value mapping, but leaves it up to each application to guarantee timestamp monotonicity. For applications which implement monotonic timestamps and use a single writer, Bigtable will look very similar to Datomic; it is also based on immutable datastructures and allows timestamp-based queries ("the past is a subrange of the present"). However, as the Spanner paper[3] indicates, Bigtable does not provide synchronous updates, so there is no guarantee that two different keys read from a replica will have the same past subrange. Since this apparently led Google internal teams to use expensive, slow alternatives to Bigtable, Spanner was designed to also provide synchronous updates in a relatively efficient fashion, even at the cost of making transactions more expensive. If I understand the Spanner paper correctly, part of this cost is that a mutator cannot rely on communicating with a locally available transactor, since each segment of the database has a single elected transaction leader at any given point in time.
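A toy model of the key-timestamp-value mapping may help show the replica problem. This is a sketch under my own assumptions, not Bigtable's client API: `VersionedTable` is a hypothetical class, timestamps are plain integers supplied by the caller, and replication is simulated by writing to two tables by hand.

```python
import bisect
from collections import defaultdict


class VersionedTable:
    """Toy key-timestamp-value map; the caller chooses every timestamp."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # key -> sorted timestamps
        self._values = defaultdict(list)      # key -> values, same order

    def write(self, key, timestamp, value):
        """Add a new version; older versions are kept."""
        stamps = self._timestamps[key]
        i = bisect.bisect_right(stamps, timestamp)
        stamps.insert(i, timestamp)
        self._values[key].insert(i, value)

    def read(self, key, timestamp):
        """Return the newest value written at or before `timestamp`, or None."""
        stamps = self._timestamps.get(key, [])
        i = bisect.bisect_right(stamps, timestamp)
        return self._values[key][i - 1] if i else None


primary, replica = VersionedTable(), VersionedTable()
for table in (primary, replica):
    table.write("x", 5, "x0")
    table.write("y", 5, "y0")

# Both keys are updated at timestamp 10 on the primary...
primary.write("x", 10, "x1")
primary.write("y", 10, "y1")
# ...but only the update to "x" has reached the replica so far.
replica.write("x", 10, "x1")

snapshot = {key: replica.read(key, 10) for key in ("x", "y")}
print(snapshot)
```

Reading both keys "as of 10" from the lagging replica mixes the new `x` with the old `y`, which is the kind of inconsistency between keys that asynchronous replication allows.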

Although Spanner offers an "SQL-like" API, its internal datastore, like Bigtable's, is key-timestamp-value. Unlike Bigtable, the timestamp is provided by the transactor and is kept within a carefully monitored deviation from the real time (Google apparently purchased its own atomic clocks, which are "not that expensive", to help maintain this guarantee). Datomic is, by design, a single-transactor system, but it allows the configuration of a standby transactor for high availability. (Only one, if I'm reading the documentation correctly.) Having a single transactor makes time synchronization much easier, and Datomic also uses the real time as a timestamp.

All three database systems, to some extent or another, conceptually provide time-ordered mutations. They differ in their guarantees about the consistency and monotonicity of timestamps on separate mutations, and also in their actual ability to provide global write-read consistency, but all of them do satisfy the same fundamental feature described by Hickey in the first few minutes of his presentation: mutations ("updates") are part of the datamodel, easily explained, and fundamentally non-destructive.


[1]: At about 19 minutes into the video, Hickey states that the "epochal time model" is just a phrase he came up with, and has no formal definition.

[2]: About 42 minutes into his presentation, Hickey describes the Bigtable architecture, apparently as an example of what he's talking about. Spanner is apparently a successor technology, which extends but does not replace the underlying datamodel.

[3]: PDF: Spanner paper from OSDI 2012

Licensed under: CC-BY-SA with attribution