Candidate Key Identification with Functional Dependencies

https://stackoverflow.com/questions/7782014

10-02-2021
|

Question

I'm having trouble understanding how to identify keys in functional dependencies. I've been looking at examples, for example:

Given a relation ABCD, find all keys not including superkeys of the

A -> BC, C -> D, CD -> AB.

This gives keys C and A. The way I thought this problem was approached was that BC and D both depend on A and C, and AB depends on CD, meaning all three of them are keys, but since CD is a superkey (C is a subset that is also a key), CD is not considered a minimal superkey.

However, in another example,

ABCDE
AB → CD
E → A
D → A

The only key here is apparently BE. Why is this true, and can anyone clarify the steps to take in finding keys with these problems?

Thanks.

Solution

The second one's a bit simpler, so taking it first . . . you know that B must be in any key, because it's not on any right-hand side. (That is, even if you have the values of ACDE, you couldn't infer the value of B.) Similarly for E; so, any key must include BE. But BE is by itself a sufficient key, because E gives you A (hence BE → ABE) and AB gives you CD (hence BE → ABCDE).

In the first one, we can see that A is a key, because A gives you B and C, and C gives you D. Similarly, C is a key, because C gives you D, and C and D together give you A and B. Conversely, we see that A and/or C must be in any key, because every left-hand-side includes at least one of them.

OTHER TIPS

A procedure that is a little more formal.

Take an FD, e.g. (example 2), AB -> CD.

Augment this using trivial FDs until you have ALL the attributes on the RHS.

You lack ABE on the RHS, so you must augment using the trivial FD ABE -> ABE to obtain ABE -> ABCDE.

That tells you that ABE is a superkey, because knowing the values in a certain row for ABE will be sufficient to determine the values for all attributes in that row.

Now inspect the OTHER FDs to see if any of them allow you to reduce the LHS (ABE) in this case. E -> A allows you to remove A from ABE, keeping thus only BE -> ABCDE. The rule for reduction is : if the LHS of another FD (E) is a proper subset of the superkey you are trying to reduce (ABE), then you can remove all the attributes from the superkey that are mentioned ONLY in the RHS of that other FD (you cannot remove E if you are looking at an "other" FD such as E -> EA !!!).

This procedure is not apt for mechanical implementation, because you might also have to take a look at "combinations" of other FDs. However, most use cases and even most fabricated class exercises typically are not complex enough to cause this procedure to fail (i.e. leaving you with a proper superkey instead of an irreducible one).

(PS to find ALL the keys, you will need to apply this to ALL given FDs)

Reference http://www.ict.griffith.edu.au/~jw/normalization/assets/Functional%20Dependencies%20and%20Normalization.pdf

Proof of Algorithm (short and straightforward, section 3)
H. Saiedian and T. Spencer, “An Efficient Algorithm to Compute the Candidate Keys of a Relational Database Schema,” The Computer Journal, vol. 39, no. 2, Feb. 1996 [Online]. Available: https://pdfs.semanticscholar.org/ab3f/f65010b50d78d583b1c2b6ce514fa072d23d.pdf. [Accessed: 31-Jul-2019]

Video Explanation
https://www.youtube.com/watch?v=s1DNVWKeQ_w

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow