What is canonical cover, closure and extraneous attribute?

https://stackoverflow.com/questions/10845987

12-06-2021
|

Question

I'm studying database concepts and there are 3 concepts that I don't understand: canonical cover, extraneous attribute and closure. I read the definition about canonical cover but I don't get the picture of how it relates to 3NF and BCNF. The definition of canonical cover appears to be that there are no extraneous attributes and extraneous attributes are attributes that don't change the closure of the set of functional dependencies and closure is the set of all functional dependencies implied by F, a set of functional dependencies.

But all this is a little fuzzy and I'd like to know both an intuitive definition and how to calculate

Canonical cover
Closure
Extraneous attribute

Functional dependency I believe I understand--it's like what would have been the PK in a table if we had those attributes in a table.

There is a rather extensive answer at database refinement - minimal cover of F (extraneous attributes) but I found it difficult to read all the set definitions and algebra and I'd rather have definitions in plain English.

For example, having the schema U={A,B,C,D,E,F,G} and the functional dependencies

AB → C
B → E
CF → D
C → A
B → F
CE → F
CD → B
B → C

Are the closures A+,B+,C+,D+,E+,F+ calculated this way?

A+ = A
B+ = BCDEF
C+ = A
D+ = D
E+ = E
F+ = F

If I'm not mistaken then BCDEFG is a superkey (”the whole key”) in 1NF/2NF but is it minimal (3NF)?

What else should be done to normalize this example to 1NF, 2NF and 3NF with the help of closures and canonical covers? Is canonical cover the same as minimal cover?

Solution

I know I'm late, but maybe someone drops by somewhen.

I think you made some mistakes:

For the closures:

B+ should be ABCDEF rather than BCDEF because of the FD C → A
C+ should be AC (the closure of an attribute always contains itself)
G+ is G, see reason of the second bullet

To calculate the canonical cover, follow this algorithm. You need to look at your list of functional dependencies:

Left reduction: try to remove all attributes of the left side of the arrow that are not necessary to create the same closure. To make the first example, AB → C, you can calculate ABs closure, which would be ABCDEF. You then try to remove A, ending up with B → C. Now you calculate the closure of B only, which is still ABCDEF -> you can remove A. At the end of this step, your FD should look like {B → C, B → E, C F → D, C → A, B → F, C E → F, C D → B, B → C, G → G}.
Now you make the same thing for the right hand side. Note that you can remove all attributes here if you want, ending up with the empty set. As an example, look at B → F: the closure of B is ABCDEF. If you remove the F from the functional dependency, ending up with B → ∅, you still got the same closure for B as before. Repeat that for the other FDs. You should end up with {B →∅, B → E, C F → D, C → A, B →∅, C E → F, C D → B, B → C, G →∅}.
Remove all FDs of the form X → ∅. You end up with {B → E, C F → D, C → A, C E → F, C D → B, B → C}.
Combine all FDs that have the same left-side of the arrow, which leads to a canonical cover of {B → C E, C F → D, C → A, C E → F, C D → B}.

For the superkeys: see this SO answer

OTHER TIPS

Are the closures A+,B+,C+,D+,E+,F+ calculated this way?

What happened to "G"? Its absence here is significant. Do you know why?

If I'm not mistaken then BCDEFG is a super key (”the whole key”) in 1NF/2NF but is it minimal (3NF)?

Superkey (one word, no spaces) doesn't mean the whole key; it just means a key. The set of all attributes is a trivial superkey, so {ABCDEFG} is a trivial superkey.

Since B->C and C->A (a transitive dependency), you can reduce the trivial superkey to {BCDEFG}. Several more reductions are possible, so {BCDEFG} isn't a minimal superkey. {BCDEFG} is a reducible superkey.

One of the minimal superkeys is {BG}. (I could say, "{BG} is an irreducible superkey.") There are other minimal superkeys.

What else should be done to normalize this example to 1NF, 2NF and 3NF with the help of closures and canonical cover?

Just in case you have a common misunderstanding of this, it's not generally possible to normalize to 2NF and no higher, or to normalize to 3NF and no higher. Eliminating partial-key dependencies ("normalizing to 2NF") can leave all your relations in 5NF.

The next step is to determine all the candidate keys (all the irreducible superkeys).

My answer is derived from the Algorithm given in Database System Concepts by Korth.

Are the closures A+,B+,C+,D+,E+,F+ calculated this way? Steps to calculate the closure of (A,B,C,D,E,F) under F let say take for B

result = {B}
repeat
for each functional dependency(e.g B -> E) in F do
begin
if B is subset of result
then result(i.e B) = result(i.e B) U {E}
end
until(result stops changing)

In this way the closures of the following will be: A+ = A

B+ = ABCDEF

C+ = AC

D+ = D

E+ = E

F+ = F

How to check an attribute is extraneous: An attribute A is extraneous in a dependency alpha(AB) -> beta(C) if

1) A belongs to beta(which is not in the current case) then create new FD F' = (F-{alpha -> beta}) U {alpha -> (beta - alpha)} and check if alpha+ under F'(**not F**) includes A, then A is extraneous in beta.

2) A belongs to alpha(which is correct), then create a new gamma{B} = alpha({AB}) - {A} and check if gamma+(i.e B+) under **F** i.e ABCDEF includes all attributes in beta({C}) and which is true. So A is extraneous in AB->C.

similarly check if C is extraneous in AB->C. So by above suggested algo

F' : AB -> NULL; B →E; CF →D; C →A; B →F; CE →F; CD →B; B →C
Compute AB+ under F' i.e ABCDEF which includes C . So Cis extraneous in AB-> C.

How to compute canonical cover?

Algo:

F' = F
If there is any FD like A->B and A->C then replace by A->BC(by union rule) Here the F' become : AB -> C; B-> CEF; C -> A; CD-> B; CE-> F; CF-> D
Find the extraneous(left/right) in F' i.e A is extraneous in AB->C so remove A from AB->C , so that it becomes B->C and update F'.
Now check if the F' is changed as previous. If changed goto step 2 and repeat until find the F' no longer changes.

(Explaining further iterations as below: itr2: F' : B -> C; B-> CEF; C -> A; CD-> B; CE-> F; CF-> D F' : B-> CEF; C -> A; CD-> B; CE-> F; CF-> D Now check C in B-> CEF, which is not extraneous Check E , which is also not extraneous. check F ,which is extraneous. So new F' : B-> CE; C -> A; CD-> B; CE-> F; CF-> D

itr3: F' : B-> CE; C -> A; CD-> B; CE-> F; CF-> D after this there is no further extraneous attribute found.

So the canonical cover of F are :

B-> CE

C -> A

CD-> B

CE-> F

CF-> D

Let me know if have some error in logic suggested above.

yes Canonical cover is same as minimal cover. and all the closures are correct

to make the example in 3NF..

just find the canonical cover
make relations of each individual fds. ie.. if Fc={AB->C,C->D} then make R1(ABC) & R2(CD).
Now check if candidate key is contained in any one relation. if yes then its in 3NF and if no then add one more relation that contains only that candidate key. and its done..!!

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow