Question

We have a shopping cart as pictured below. The setup works well, except for one fatal flaw: an order is linked to a product, so if I update the product after you have purchased it, there is no way for me to show you what the product looked like when you bought it (including its price). This means we need versioning.

Current Schema

My current plan is to create a duplicate of the product or variant in the database whenever a new product or variant is created, or an existing one is edited. When a purchase is made, the order is linked to the version, not the product.

This seems rather simple, except that, from what I can see, the only things we don't need to version are the categories (no one cares what categories a product was in). So we need to version:

  • Products
  • Variants
  • The key -> value pairs of attributes for each version
  • The images

My current thinking is as follows.

Note: when a product is created, a default variant is created as well; this variant cannot be removed.

  • When a product is created
    • Insert the product into the products table.
    • Create the default variant
    • Duplicate the product into the products_versions table
      • Replace current id column with a product_id column
      • Add id column
    • Duplicate the variant into the variants_versions table
      • Replace current id column with variant_id column
      • Add id column
      • Replace product_id column with product_version_id column

  • When a product is edited
    • Update the product in the products table.
    • Duplicate the product into the products_versions table
      • Replace current id column with a product_id column
      • Add id column
    • Duplicate all product variants into the variants_versions table
      • Replace current id column with variant_id column
      • Add id column
      • Replace product_id column with product_version_id column
    • Duplicate all variant_image_links into the variant_Image_link_version table
      • Replace current variant_id column with variant_version_id column

  • When a variant is added
    • Add the variant into the variants table.
    • Duplicate the product into the products_versions table
      • Replace current id column with a product_id column
      • Add id column
    • Duplicate all product variants into the variants_versions table
      • Replace current id column with variant_id column
      • Add id column
      • Replace product_id column with product_version_id column

  • When a variant is edited
    • Update the variant in the variants table.
    • Duplicate the product into the products_versions table
      • Replace current id column with a product_id column
      • Add id column
    • Duplicate all product variants into the variants_versions table
      • Replace current id column with variant_id column
      • Add id column
      • Replace product_id column with product_version_id column
    • Duplicate all variant_image_links into the variant_Image_link_version table
      • Replace current variant_id column with variant_version_id column
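
To make the steps above concrete, here is a minimal sketch of the "duplicate into the versions tables" idea using Python's built-in sqlite3 module. The column lists (name, price), the order_items table, and the helper snapshot_product are assumptions for illustration only; the real schema has more columns plus attributes and image links, which would be copied the same way.

```python
import sqlite3

# Hypothetical, cut-down schema: only the columns needed for the sketch.
SCHEMA = """
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price INTEGER NOT NULL
);
CREATE TABLE variants (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id),
    name TEXT NOT NULL,
    price INTEGER NOT NULL
);
-- Version tables: their own id, plus a pointer back to the live row.
CREATE TABLE products_versions (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id),
    name TEXT NOT NULL,
    price INTEGER NOT NULL
);
CREATE TABLE variants_versions (
    id INTEGER PRIMARY KEY,
    variant_id INTEGER NOT NULL REFERENCES variants(id),
    product_version_id INTEGER NOT NULL REFERENCES products_versions(id),
    name TEXT NOT NULL,
    price INTEGER NOT NULL
);
-- Orders point at the frozen variant version, never at the live variant.
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    variant_version_id INTEGER NOT NULL REFERENCES variants_versions(id),
    quantity INTEGER NOT NULL
);
"""


def snapshot_product(conn, product_id):
    """Copy a product and all of its variants into the *_versions tables.

    Returns the id of the new products_versions row.
    """
    with conn:  # one transaction: the whole snapshot is stored, or none of it
        cur = conn.execute(
            "INSERT INTO products_versions (product_id, name, price) "
            "SELECT id, name, price FROM products WHERE id = ?",
            (product_id,),
        )
        if cur.rowcount != 1:
            raise LookupError(f"no product with id {product_id}")
        product_version_id = cur.lastrowid
        # Every variant is copied, changed or not, and linked to the new
        # product version instead of the live product.
        conn.execute(
            "INSERT INTO variants_versions "
            "(variant_id, product_version_id, name, price) "
            "SELECT id, ?, name, price FROM variants WHERE product_id = ?",
            (product_version_id, product_id),
        )
    return product_version_id
```

Calling snapshot_product after every create or edit gives each order a stable row to point at, at the cost of copying unchanged variants each time.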

So the final structure looks like:

This all seems fine, except that it duplicates a heck of a lot of data: for example, if we update a product we duplicate its variants even though they have not changed since they were inserted. It also seems like a lot of work.

Is there a better way of doing this?


Solution

You can do what ERP (and also possibly Payroll) systems do: Add a Start and End Date/Time. So...

  • Variants and prices are matched to their product by their common date ranges.
  • All queries default to the current date, and the joins between tables must also take the date ranges into account. For example, to require the child row's range to lie within the parent's: parent_start_date <= child_start_date AND parent_end_date >= child_end_date
  • You end up with a duplicated row for each price or variant change, but you no longer need to update as many records (such as variant ids) when the product price changes.
  • Make sure valid dates are always used. PS: use your system's max date for the end datetime of the most recent record.
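
As a rough illustration of the date-range idea, here is a small sketch using Python's sqlite3 module. The column names start_date and end_date, the MAX_DATE constant, and the helpers set_product and product_as_of are all made up for this example. It also assumes half-open ranges (start inclusive, end exclusive) and ISO-formatted timestamp strings, which compare correctly as text.

```python
import sqlite3

# Stand-in for "your system's max date"; marks the currently valid row.
MAX_DATE = "9999-12-31 23:59:59"


def create_schema(conn):
    conn.execute(
        """
        CREATE TABLE products (
            product_id INTEGER NOT NULL,
            name       TEXT    NOT NULL,
            price      INTEGER NOT NULL,
            start_date TEXT    NOT NULL,  -- inclusive
            end_date   TEXT    NOT NULL,  -- exclusive
            PRIMARY KEY (product_id, start_date)
        )
        """
    )


def set_product(conn, product_id, name, price, at):
    """Close the currently valid row (if any) at `at` and open a new one."""
    with conn:  # both statements succeed or fail together
        conn.execute(
            "UPDATE products SET end_date = ? "
            "WHERE product_id = ? AND end_date = ?",
            (at, product_id, MAX_DATE),
        )
        conn.execute(
            "INSERT INTO products "
            "(product_id, name, price, start_date, end_date) "
            "VALUES (?, ?, ?, ?, ?)",
            (product_id, name, price, at, MAX_DATE),
        )


def product_as_of(conn, product_id, at):
    """Return (name, price) as the product looked at time `at`, or None."""
    return conn.execute(
        "SELECT name, price FROM products "
        "WHERE product_id = ? AND start_date <= ? AND ? < end_date",
        (product_id, at, at),
    ).fetchone()
```

With this layout an order only needs to store the product id and the order timestamp; passing that timestamp to product_as_of recovers the product as it was at purchase time.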


OTHER TIPS

Another approach to this would be to never edit or remove your data, only create new data. In SQL terms, the only operations you ever run on your tables are INSERTs and SELECTs.

To accomplish what you want, each table would need the following columns:

  • version_id - this would be your primary key
  • id - this would be the thing that holds versions of your object together (e.g. to find all versions of a product, SELECT * FROM products WHERE id = ?)
  • creation_date
  • is_active - you're not deleting anything, so you need a flag to (logically) get rid of data

With this, here's what your products and variants tables would look like:

CREATE TABLE products (
  version_id CHAR(8) NOT NULL PRIMARY KEY,   -- unique per version
  id INTEGER NOT NULL,                       -- shared by all versions of one product
  creation_date TIMESTAMP NOT NULL DEFAULT NOW(),
  is_active BOOLEAN DEFAULT true,            -- false marks a logical delete
  name VARCHAR(1024) NOT NULL,
  price INTEGER NOT NULL
);

CREATE TABLE variants (
  version_id CHAR(8) NOT NULL PRIMARY KEY,   -- unique per version
  id INTEGER NOT NULL,                       -- shared by all versions of one variant
  creation_date TIMESTAMP NOT NULL DEFAULT NOW(),
  is_active BOOLEAN DEFAULT true,
  product_version_id CHAR(8) NOT NULL,       -- the exact product version this row belongs to
  price INTEGER NOT NULL,
  override_price INTEGER NOT NULL,
  FOREIGN KEY (product_version_id) REFERENCES products(version_id)
);

Now, to insert into either table:

  1. Generate a unique version_id (there are several strategies for this; one is to use a database sequence, or for MySQL an AUTO_INCREMENT column).
  2. Generate an id. This id is consistent for all versions of a product.

To update a row in a table, one must insert the entire graph: for example, to update a product, one must insert a new product row and new variant rows. (There is a lot of room for optimization here, but it's easiest to start with the un-optimized solution.)

For example, to update a product

  1. Generate a unique version_id
  2. Use the same id
  3. Insert new product variants. The variants will be the same as the ones linked to the previous version of the product that you're "updating", except the product_version_id will be different.
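
The update steps above could be sketched roughly as follows with Python's sqlite3 module. This is only an illustration under a few assumptions: the schema is adapted to SQLite (CURRENT_TIMESTAMP instead of NOW()), version ids come from secrets.token_hex rather than a sequence, and the helper names (new_version_id, insert_product, insert_variant, update_product) are invented for the example.

```python
import secrets
import sqlite3


def new_version_id():
    # Assumption: 8 random hex characters to fit CHAR(8). A sequence or
    # AUTO_INCREMENT column would avoid the (small) chance of a collision,
    # which here would surface as an IntegrityError on the primary key.
    return secrets.token_hex(4)


def create_schema(conn):
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(
        """
        CREATE TABLE products (
            version_id CHAR(8) NOT NULL PRIMARY KEY,
            id INTEGER NOT NULL,
            creation_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            is_active BOOLEAN DEFAULT 1,
            name VARCHAR(1024) NOT NULL,
            price INTEGER NOT NULL
        );
        CREATE TABLE variants (
            version_id CHAR(8) NOT NULL PRIMARY KEY,
            id INTEGER NOT NULL,
            creation_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            is_active BOOLEAN DEFAULT 1,
            product_version_id CHAR(8) NOT NULL,
            price INTEGER NOT NULL,
            override_price INTEGER NOT NULL,
            FOREIGN KEY (product_version_id) REFERENCES products(version_id)
        );
        """
    )


def insert_product(conn, product_id, name, price):
    version_id = new_version_id()
    with conn:
        conn.execute(
            "INSERT INTO products (version_id, id, name, price) VALUES (?, ?, ?, ?)",
            (version_id, product_id, name, price),
        )
    return version_id


def insert_variant(conn, product_version_id, variant_id, price, override_price):
    version_id = new_version_id()
    with conn:
        conn.execute(
            "INSERT INTO variants "
            "(version_id, id, product_version_id, price, override_price) "
            "VALUES (?, ?, ?, ?, ?)",
            (version_id, variant_id, product_version_id, price, override_price),
        )
    return version_id


def update_product(conn, old_version_id, name, price):
    """'Update' a product by inserting a new version plus copies of its variants."""
    with conn:  # the new product row and its variants are written atomically
        row = conn.execute(
            "SELECT id FROM products WHERE version_id = ?", (old_version_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no product version {old_version_id!r}")
        new_vid = new_version_id()  # step 1: fresh version_id
        conn.execute(               # step 2: same id as before
            "INSERT INTO products (version_id, id, name, price) VALUES (?, ?, ?, ?)",
            (new_vid, row[0], name, price),
        )
        old_variants = conn.execute(
            "SELECT id, price, override_price FROM variants "
            "WHERE product_version_id = ? AND is_active",
            (old_version_id,),
        ).fetchall()
        conn.executemany(           # step 3: same variants, new product_version_id
            "INSERT INTO variants "
            "(version_id, id, product_version_id, price, override_price) "
            "VALUES (?, ?, ?, ?, ?)",
            [(new_version_id(), vid, new_vid, p, op) for vid, p, op in old_variants],
        )
    return new_vid
```

The old rows are never touched, so an order that stored the old version_id still resolves to exactly what was bought.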

This principle can extend to all your tables.

To find the most recent version of a product, you need to use the creation_date column to get the product that was most recently created.
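
A possible shape for that lookup, sketched with Python's sqlite3 module. The helper names latest_version and latest_versions are assumptions, and the table is adapted to SQLite. Note that if two versions of the same product share a creation_date the result is ambiguous, so a monotonically increasing version_id may be a safer tie-breaker in practice.

```python
import sqlite3


def create_products(conn):
    conn.execute(
        """
        CREATE TABLE products (
            version_id CHAR(8) NOT NULL PRIMARY KEY,
            id INTEGER NOT NULL,
            creation_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            is_active BOOLEAN DEFAULT 1,
            name VARCHAR(1024) NOT NULL,
            price INTEGER NOT NULL
        )
        """
    )


def latest_version(conn, product_id):
    """Return (version_id, name, price) of the newest version, or None.

    is_active is deliberately not filtered here: if the newest row is
    inactive, the product counts as deleted and the caller should check.
    """
    return conn.execute(
        "SELECT version_id, name, price FROM products "
        "WHERE id = ? ORDER BY creation_date DESC LIMIT 1",
        (product_id,),
    ).fetchone()


def latest_versions(conn):
    """Return (id, version_id, price) for the newest version of every product."""
    return conn.execute(
        """
        SELECT p.id, p.version_id, p.price
        FROM products AS p
        JOIN (SELECT id, MAX(creation_date) AS latest
              FROM products GROUP BY id) AS m
          ON m.id = p.id AND m.latest = p.creation_date
        ORDER BY p.id
        """
    ).fetchall()
```

An index on (id, creation_date) would likely help both queries once the tables grow, since every edit adds rows.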

This model will use more space, but I think this may be a fair trade-off given its simplicity: there are only INSERTs and SELECTs and data is never mutated.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow