Question

I know that key-value pairs aren't good database design, aren't normalized etc however in this case I believe they are the most appropriate solution.

My excuse for this, and some background: A large set of items is being pushed into a set of tables, and each item can be tagged with arbitrary metadata that the user can choose. The user can choose the metadata because they are specifying how they wish to categorize, report on and view the items later. For this specific business problem it is not our place (as systems designers) to say what these dimensions are. There's not a consistent set of keys used across items and in some cases the presence of a certain key will be used as a filter condition.

Another bit of background info, the entries will be INSERTed, but not UPDATEd. Eventually they will be DELETEd (sequentially, in the same order they were inserted).

The question, "Efficient storage": by this I am referring to query (read) performance. The following types of queries will be used:

  • Get items with the given key, any value
  • Get items with the given key and value
  • Get items with all key names
  • Get items with all key names and values

Basically, which is the best choice given these options?:

OPTION 1

Items table:
item_id (integer, pk)
... item fields ...

ItemFacts table:
item_id (integer, fk)
key_name (nvarchar(64))
key_value (nvarchar(128))

OPTION 2

Items table:
item_id (integer, pk)
... item fields ...

Facts table:
fact_id (integer, pk)
key_name (nvarchar(64))
key_value (nvarchar(128))

ItemFacts table:
item_id (integer, fk)
fact_id (integer, fk)

(There could be a third option where the key names are pulled out to a separate table again to reduce redundancy since there may be a whole load of used/possible values for a given key name, might also be worth considering)

roughly speaking, there will be a great deal of duplicated key/value matches. As such there should be a storage efficiency increase. I realise this is a bit of an open-ended question, but what about read performance? How about if I introduce this query too:?

  • Get items where the value for the given key begins with 'x'

If I can provide any more clarification, please let me know.

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top