Question

Does a StructuredProperty reference the parent or child?

class Invoices(ndb.Model): #Child

class Customers(ndb.Model): #Parent
    invoice = ndb.StructuredProperty(Invoices)

or...

class Customers(ndb.Model): #Parent

class Invoices(ndb.Model): #Child
    customer = ndb.StructuredProperty(Customers)

Solution

To answer your question in the context of "what is the better practice for a NoSQL Datastore", here's what I can offer.

First, you probably want to name your models in the singular, as they should describe a single Invoice or Customer entity, not several.

Next, using a StructuredProperty implies that you'd like to keep all of this information in a single entity - this will reduce write/read ops, but can introduce some limitations. (See the docs - or this related question)

The most common relationship would be a one(Customer) to many(Invoice) relationship, which you can structure as below:

class Invoice(ndb.Model): #Child
    invoice_id = ndb.StringProperty(required=True)

class Customer(ndb.Model): #Parent
    invoices = ndb.StructuredProperty(Invoice, repeated=True)

    def get_invoice_by_id(self, invoice_id):
        """Returns a customer Invoice by invoice_id. Raises KeyError if invoice is not present."""
        invoice_matches = [iv for iv in self.invoices if iv.invoice_id == invoice_id]
        if not invoice_matches:
            raise KeyError("Customer has no Invoice with ID %s" % invoice_id)
        return invoice_matches[0]  # this could be changed to return all matches

Keep in mind the following restrictions of this implementation:

  1. A repeated StructuredProperty cannot contain repeated properties of its own.
  2. Keeping invoice_id globally unique is more complex than if each Invoice were in its own entity group. (invoice_key.get() is always better than the query this requires.)
  3. You would need an instance method on Customer to find an Invoice by invoice_id.
  4. You would need logic to prevent invoices with the same ID from existing on a single Customer
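
Restrictions 3 and 4 boil down to a lookup and a uniqueness check over the customer's invoice list. Here is a minimal sketch of that logic, using plain Python dataclasses as stand-ins for the ndb models so that it runs without the App Engine SDK. `add_invoice` is a hypothetical helper name, not part of ndb:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Invoice:
    """Stand-in for the ndb Invoice model (assumption: only invoice_id matters here)."""
    invoice_id: str


@dataclass
class Customer:
    """Stand-in for the ndb Customer model with a repeated structured property."""
    invoices: List[Invoice] = field(default_factory=list)

    def get_invoice_by_id(self, invoice_id):
        """Return the invoice with this ID, or raise KeyError if absent."""
        for invoice in self.invoices:
            if invoice.invoice_id == invoice_id:
                return invoice
        raise KeyError("Customer has no Invoice with ID %s" % invoice_id)

    def add_invoice(self, invoice):
        """Hypothetical helper: append an invoice, rejecting duplicate IDs."""
        if any(iv.invoice_id == invoice.invoice_id for iv in self.invoices):
            raise ValueError("Duplicate invoice ID %s" % invoice.invoice_id)
        self.invoices.append(invoice)


customer = Customer()
customer.add_invoice(Invoice("INV-1"))
customer.add_invoice(Invoice("INV-2"))
```

With real ndb models, the same check would run before customer.put(), ideally inside a transaction so that two concurrent requests cannot both add the same ID.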

Here are some of the advantages:

  1. You can query for Customer
  2. Querying an Invoice by invoice_id will return the Customer instance, along with all invoices. (This could be a pro and a con, actually - you need logic to return the invoice from the customer)

Here is a more common solution, though by no means necessarily the "right solution." It uses ancestor relationships, which allow you to keep writes to an Invoice and its related Customer atomic, so you could maintain aggregate invoice statistics at the Customer level (total_orders, total_gross, etc.).

class Invoice(ndb.Model):
    customer = ndb.ComputedProperty(lambda self: self.key.parent(), indexed=False)  # when not indexed, this is essentially a @property

class Customer(ndb.Model):
    def get_invoice_by_id(self, invoice_id):
        """Returns a customer Invoice by invoice_id, or None if the invoice is not present."""
        invoice_key = ndb.Key(Invoice._get_kind(), invoice_id, parent=self.key)
        return invoice_key.get()

    def query_invoices(self):
        """Returns ndb.Query for invoices by this Customer."""
        return Invoice.query(ancestor=self.key)

invoice = Invoice(id=invoice_id, parent=customer.key, **invoice_properties)  # id must match what get_invoice_by_id looks up
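
To see why the ancestor layout makes lookups cheap, it can help to model a key as a path of (kind, id) pairs. The toy below is plain Python, not ndb: a dict stands in for the datastore, and `make_key`, `put`, `get` and `ancestor_query` are hypothetical helpers that only mimic a key lookup and an ancestor query.

```python
# Toy datastore: full key path -> entity properties. This is not the ndb API.
datastore = {}


def make_key(*pairs):
    """Build a key path such as (("Customer", "alice"), ("Invoice", "INV-1"))."""
    return tuple(pairs)


def put(key, **properties):
    """Store an entity's properties under its full key path."""
    datastore[key] = properties


def get(key):
    """Direct lookup by full key; returns None when the entity is missing."""
    return datastore.get(key)


def ancestor_query(ancestor_key, kind):
    """Return entities of `kind` whose key path starts with `ancestor_key`."""
    n = len(ancestor_key)
    return [
        props for key, props in datastore.items()
        if len(key) > n and key[:n] == ancestor_key and key[-1][0] == kind
    ]


alice = make_key(("Customer", "alice"))
bob = make_key(("Customer", "bob"))
put(alice, name="Alice")
put(alice + (("Invoice", "INV-1"),), gross_amount=100)
put(alice + (("Invoice", "INV-2"),), gross_amount=250)
put(bob + (("Invoice", "INV-1"),), gross_amount=40)

invoice = get(alice + (("Invoice", "INV-2"),))
alice_invoices = ancestor_query(alice, "Invoice")
```

Note that INV-1 exists under both customers: with ancestor keys, an ID only has to be unique under its parent, so global uniqueness of invoice IDs would still need separate handling.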

Good luck with App Engine! Once you get the hang of all of this, it is a truly rewarding platform.

Update:

Here is some additional code for transactionally updating customer aggregate totals as I mentioned above.

def create_invoice(customer_key, gross_amount_paid):
    """Creates an invoice for a given customer.

    Args:
        customer_key: (ndb.Key) Customer key
        gross_amount_paid: (int) Gross amount paid by customer
    """

    @ndb.transactional
    def _txn():
        customer = customer_key.get()
        invoice = Invoice(parent=customer.key, gross_amount=gross_amount_paid)

        # Keep an atomic, transactional count of customer aggregates
        customer.total_gross += gross_amount_paid
        customer.total_orders += 1

        # batched put for API optimization
        ndb.put_multi([customer, invoice])  

        return invoice

    return _txn()

The above code works in a single entity group transaction (i.e. ndb.transactional(xg=False), the default) because Invoice is a child entity of Customer. If that ancestor relationship is lost, you would need xg=True. (I'm not sure if it's more expensive, but it is less optimized.)
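
The rule behind this can be sketched in a few lines: an entity group is identified by the root of the key path, and a transaction that touches more than one root needs xg=True. The helper names below are hypothetical, and the key paths are plain tuples rather than ndb.Key objects:

```python
def entity_group_root(key_path):
    """The first (kind, id) pair of a key path identifies its entity group."""
    return key_path[0]


def needs_cross_group(key_paths):
    """True when the given keys span more than one entity group."""
    return len({entity_group_root(path) for path in key_paths}) > 1


customer_key = (("Customer", "alice"),)
child_invoice_key = (("Customer", "alice"), ("Invoice", "INV-1"))
root_invoice_key = (("Invoice", "INV-1"),)  # invoice stored without a parent
```

Here customer_key and child_invoice_key share one root, so a plain transaction suffices. Swapping in root_invoice_key makes the transaction span two groups.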

Licensed under: CC-BY-SA with attribution