Question

How do we implement aggregation or composition with NDB on Google App Engine? What is the best way to proceed depending on the use case? Thanks!

I've tried using a repeated property. In this very simple example, a Project has a list of Tag keys (I chose this over StructuredProperty because many Project objects can share the same Tag objects).

class Project(ndb.Model):
    name = ndb.StringProperty()
    tags = ndb.KeyProperty(kind=Tag, repeated=True)
    budget = ndb.FloatProperty()
    date_begin = ndb.DateProperty(auto_now_add=True)
    date_end = ndb.DateProperty(auto_now_add=True)

    @classmethod
    def all(cls):
        return cls.query()

    def addTags(self, from_str):
        # Instance method: assigning to cls.tags inside a classmethod
        # would shadow the class-level property instead of setting
        # entity data, so we set self.tags on the entity.
        tagname_list = from_str.split(',')
        tag_list = []
        for tag_name in tagname_list:
            tag_list.append(Tag.addTag(tag_name))
        self.tags = tag_list

--

Edit (2): Thanks. In the end, I chose to create a new Model class, 'Relation', representing a relationship between two entities. It's really an association; I confess my first design was ill-suited.
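The final 'Relation' design isn't shown in the post, but such an association entity might look like this minimal sketch (the property and method names are assumptions, not from the original):

```python
from google.appengine.ext import ndb

class Relation(ndb.Model):
    # Hypothetical association entity: one record per Project<->Tag link.
    project = ndb.KeyProperty(kind='Project')
    tag = ndb.KeyProperty(kind='Tag')

    @classmethod
    def tags_for(cls, project_key):
        # All Tag keys linked to the given Project.
        return [r.tag for r in cls.query(cls.project == project_key)]
```

One advantage of this shape over a repeated KeyProperty is that links can be added or removed by creating or deleting small Relation entities, without rewriting the Project entity each time.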


Solution 2

It really does depend on the use case. For small numbers of items StructuredProperty and repeated properties may well be the best fit.

For large numbers of entities, look instead at setting the parent/ancestor in the key for composition, and at a KeyProperty pointing back to the primary entity for a many-to-one aggregation.
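The two patterns above can be sketched as follows (the model names are illustrative, not from the original):

```python
from google.appengine.ext import ndb

class Invoice(ndb.Model):
    total = ndb.FloatProperty()

class LineItem(ndb.Model):
    # Composition: a LineItem is created with its Invoice's key as
    # parent, placing it in the same entity group as the Invoice.
    amount = ndb.FloatProperty()

class Comment(ndb.Model):
    # Many-to-one aggregation: a plain KeyProperty back-reference to
    # the primary entity, with no ancestor relationship.
    project = ndb.KeyProperty(kind='Project')
    text = ndb.TextProperty()

# item = LineItem(parent=invoice_key, amount=9.99)   # composition
# LineItem.query(ancestor=invoice_key)               # fetch the parts
# Comment.query(Comment.project == project_key)      # fetch aggregated
```

Note the trade-off: the ancestor query is strongly consistent but puts all parts in one entity group, while the KeyProperty query avoids entity-group contention at the cost of eventual consistency.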

However, the choice will also depend heavily on the actual usage pattern; that is where efficiency considerations kick in.

The best I can suggest is to consider carefully how you plan to use these relationships: how active are they (are members constantly added, changed, and deleted)? Do you need to see all members of the relation most of the time, or just subsets? These considerations may well require adjustments to the approach.

OTHER TIPS

An alternative would be to use BigQuery. At first we used NDB, with a RawModel that stores individual, non-aggregated records, and an AggregateModel that stores the aggregate values.

The AggregateModel was updated every time a RawModel was created, which caused some consistency issues. In hindsight, properly using parent/ancestor keys as Tim suggested would have worked, but in the end we found BigQuery much more pleasant and intuitive to work with.

We just have cron jobs that run every day: one pushes RawModel records to BigQuery, and another creates the AggregateModel records from data fetched back out of BigQuery.
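As a sketch, the two daily jobs might be declared in App Engine's cron.yaml like this (the handler URLs and schedules are assumptions, not from the original):

```yaml
cron:
- description: push RawModel records to BigQuery
  url: /cron/export_raw_to_bigquery
  schedule: every day 02:00
- description: rebuild AggregateModel records from BigQuery results
  url: /cron/build_aggregates
  schedule: every day 04:00
```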

(Of course, this is only effective if you have lots of data to aggregate)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow