Question

Since the Google App Engine Datastore is based on Bigtable and we know that's not a relational database, how do you design a database schema/data model for applications that use this type of database system?

Was it helpful?

Solution

Designing a bigtable schema is an open process, and basically requires you to think about:

  • The access patterns you will be using and how often each will be used
  • The relationships between your types
  • What indices you are going to need
  • The write patterns you will be using (in order to effectively spread load)

GAE's datastore automatically denormalizes your data. That is, each index contains a (mostly) complete copy of the data, and thus every index adds significantly to time taken to perform a write, and the storage space used.

If this were not the case, designing a Datastore schema would be a lot more work: You would have to think carefully about the primary key for each type, and consider the effect of your decision on the locality of data. For example, when rendering a blog post you would probably need to display the comments to go along with it, so each comment's key would probably begin with the associated post's key.

With Datastore, this is not such a big deal: The query you use will look something like "Select * FROM Comment WHERE post_id = N." (If you want to page the comments, you would also have a limit clause, and a possible suffix of " AND comment_id > last_comment_id".) Once you add such a query, Datastore will build the index for you, and your reads will be magically fast.

Something to keep in mind is that each additional index creates some additional cost: it is best if you can use as few access patterns as possible, since it will reduce the number of indices GAE will construct, and thus the total storage required by your data.

Reading over this answer, I find it a little vague. Maybe a hands-on design question would help to scope this down? :-)

OTHER TIPS

You can use www.web2py.com. You build the model and the application once and it works on GAE but also witl SQLite, MySQL, Posgres, Oracle, MSSQL, FireBird

As GAE builds on how data is managed in Django there is a lot of info on how to address similar questions in the Django documentation (for example see here, scroll down to 'Your first model').

In short you design you db model as a regular object model and let GAE sort out all of the object-relational mappings.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top