Question

There is a large table, products. I need to add a boolean flag, disabled, to my product model. Normally I would just add a new field to the existing table. But this attribute will be used very rarely, and given the number of records in the table, the new field would be an unnecessary hit on performance and disk space.

So I decided to do a kind of 1NF normalization for a one-to-one relation (i.e. move this field to another table with a foreign key referencing products; I don't know if it's really 1NF, which is part of my question). But I don't actually need to store both true and false values for each product's disabled attribute, because that would make the relation table as large as products. So there is no need for a value field in the relation table. My schema is:

CREATE TABLE products (
  id INT PRIMARY KEY,
  name VARCHAR
);

CREATE TABLE disabled_products (
  product_id INT NOT NULL,
  CONSTRAINT fk_product FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE CASCADE
);

(SQLFiddle to fiddle around).

This gives me exactly what I wanted: the value is stored only in those rare cases when the flag is set. Behind the scenes, the flag is represented not by a table column but by the very presence of a record in disabled_products for a given product.

I just want to know whether I am doing this right.

What are possible drawbacks of such design, if any?

Does it fit the relational model (by "it" I mean this way of normalizing in general, and a table consisting of a single foreign key column in particular)? And if so, what is this solution called in relational database terms?


Solution

People often ask themselves whether they should split a table or just create one large table. I'll speak to the general logic at the bottom. In your case it's purely a well-defined performance calculation.

Pros & Cons of your solution

1) CON: Even though records are rarely disabled, you'll need to LEFT JOIN this table in every query that needs active records.

2) CON: The space saving is about 1 byte per row, which is most likely negligible, so why bother?

3) PRO: It saves you from having to alter a table, which may be a problem for you.
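To make the first point concrete, here is a minimal sketch of what "every query that needs active records" looks like with the two-table schema. It runs against an in-memory SQLite database through Python's sqlite3 module purely as a stand-in for whatever engine you use; the sample rows are made up, and the PRIMARY KEY on product_id is my assumption (the schema in the question does not declare one, so it would allow duplicate rows). A NOT EXISTS subquery is shown next to the LEFT JOIN form because it expresses the same anti-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless enabled
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    -- Assumption: product_id is made the primary key to prevent duplicates.
    CREATE TABLE disabled_products (
        product_id INTEGER NOT NULL PRIMARY KEY
            REFERENCES products (id) ON DELETE CASCADE
    );
    -- Made-up sample data: product 2 is disabled.
    INSERT INTO products (id, name) VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO disabled_products (product_id) VALUES (2);
""")

# Anti-join via LEFT JOIN: keep products with no match in disabled_products.
active_left_join = conn.execute("""
    SELECT p.id, p.name
    FROM products AS p
    LEFT JOIN disabled_products AS d ON d.product_id = p.id
    WHERE d.product_id IS NULL
    ORDER BY p.id
""").fetchall()

# The same anti-join written with NOT EXISTS.
active_not_exists = conn.execute("""
    SELECT p.id, p.name
    FROM products AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM disabled_products AS d WHERE d.product_id = p.id
    )
    ORDER BY p.id
""").fetchall()

print(active_left_join)
print(active_not_exists)
```

Either form has to be repeated, or hidden behind a view, wherever only active products are wanted.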

Recommendation: Given the above pros and cons, I'd recommend just adding a field to your table. It's just a byte.
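For comparison, a sketch of the single-column alternative, again using in-memory SQLite through Python's sqlite3 module as a stand-in and made-up sample rows. The NOT NULL DEFAULT 0 declaration is my choice for the illustration, and how expensive the ALTER TABLE is on a large table depends on your database engine, so treat this as an outline rather than a migration script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO products (id, name) VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');

    -- Add the flag in place; existing rows take the default (not disabled).
    ALTER TABLE products ADD COLUMN disabled BOOLEAN NOT NULL DEFAULT 0;

    -- Disable one product (made-up example).
    UPDATE products SET disabled = 1 WHERE id = 2;
""")

# Active products need no join at all, only a filter on the new column.
active = conn.execute(
    "SELECT id, name FROM products WHERE disabled = 0 ORDER BY id"
).fetchall()
print(active)
```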

In general, people split their tables vertically when it is impractical to even modify a table or when there are different classes of records that each need a specific set of fields, or when for performance reasons they want to partition their table.

OTHER TIPS

The possible drawbacks are obvious. In contrast to storing the flag as a column value, you cannot assume every row has a disabled value (so you have to apply some custom logic), and you cannot inner join the two tables, because the smaller table would reduce your result set significantly. Also, the additional table contains no information other than the disabled state and is therefore purely redundant, although in terms of "redundancy in favor of performance" it somehow reminds me of a star schema (http://en.wikipedia.org/wiki/Star_schema).
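A small sketch of the "custom logic" this implies, using in-memory SQLite through Python's sqlite3 module as a stand-in. The helper name set_disabled, the sample rows, and the PRIMARY KEY on product_id are all assumptions made for the illustration and are not part of the question's schema. It shows that setting the flag becomes an INSERT or DELETE, that a per-product disabled value has to be derived with a LEFT JOIN, and that an inner join returns only the disabled products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless enabled
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    -- Assumption: product_id is the primary key so a product is listed at most once.
    CREATE TABLE disabled_products (
        product_id INTEGER NOT NULL PRIMARY KEY
            REFERENCES products (id) ON DELETE CASCADE
    );
    INSERT INTO products (id, name) VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
""")


def set_disabled(conn, product_id, disabled):
    """Hypothetical helper: the flag is set by row presence, not by a value."""
    if disabled:
        conn.execute(
            "INSERT OR IGNORE INTO disabled_products (product_id) VALUES (?)",
            (product_id,),
        )
    else:
        conn.execute(
            "DELETE FROM disabled_products WHERE product_id = ?", (product_id,)
        )


set_disabled(conn, 2, True)
set_disabled(conn, 2, True)   # repeated call is a no-op thanks to OR IGNORE
set_disabled(conn, 3, True)
set_disabled(conn, 3, False)  # clearing the flag deletes the row

# Derive a disabled value for every product: 1 if a matching row exists, else 0.
with_flag = conn.execute("""
    SELECT p.id, p.name, d.product_id IS NOT NULL AS disabled
    FROM products AS p
    LEFT JOIN disabled_products AS d ON d.product_id = p.id
    ORDER BY p.id
""").fetchall()

# An inner join keeps only products that have a row in disabled_products.
inner_joined = conn.execute("""
    SELECT p.id
    FROM products AS p
    JOIN disabled_products AS d ON d.product_id = p.id
    ORDER BY p.id
""").fetchall()

print(with_flag)
print(inner_joined)
```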

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow