Question

1NF rule says that we should not keep data of different types in one single column. Does it mean that sql_variant is not compatible with first normal form hence should not be used?

Was it helpful?

Solution

1NF rule says that we should not keep data of different types in one single column.

There's no general agreement about exactly what first normal form (1NF) means in detail. Personally, I prefer Chris Date's current interpretation, which simply states that all relation variables (relvars) are in 1NF by definition.

Does it mean that sql_variant is not compatible with first normal form hence should not be used?

This is a bit imprecise since normal forms are properties of the logical model, whereas sql_variant is a physical (SQL Server-specific) type. We would need to define a mapping from relational model type (or domain) to physical implementation for this to truly make sense.

Nevertheless, I will say that sql_variant is not inherently an indication that the associated model violates 1NF - so long as the model defines the range of allowable attribute values to match those storable in sql_variant (with or without suitable constraints). The fact that sql_variant can store values of several different underlying types is neither here nor there. We might as well argue that float types violate 1NF since they can contain integers as well as reals etc.

In other words, a physical table using sql_variant can be a valid representation of a relation variable (within the general limitations of SQL anyway). It simply depends on what the underlying logical model says about the range of values allowed in the attribute.

Consider also that whatever we might physically store in a sql_variant might also be stored in a string representation, or as a binary value. String and binary values do not necessarily violate 1NF, so why should sql_variant? Similar arguments can be made about xml or json (and array types for databases that support such things).

All of the above is orthogonal to the question of whether using sql_variant is a feature of a "good" design/implementation or not.

Further reading:

Footnote

In relational maths, some domains may have relations as elements. If such a domain were used as the basis for an attribute in another relation, the result would not be in what Codd originally called normal form. Removing such relations from consideration without reducing the expressive power of the model was the whole point. Relvars eliminate this kind of nesting by definition, as stated. Beginners often get around the restriction in physical design by storing CSV strings in columns. There are consequences of such designs with regard to keyed access to data.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top