Keys and Functional Dependencies in Database

A key or candidate key is a set of attributes that uniquely identify a record in a relation. Here we discuss how to identify different keys and key attributes, when a set of functional dependencies F is given.

For a given relation R = {A1,A2,A3,………..,An} and a set of functional dependencies F, K is a subset of R and is known as key of R if closure of K, i.e. K+ →A1,A2,A3,… ,An and no subset of K i.e. X subset of K such that X+ → A1,A2,A3,…,An.

In simple words a subset K of R is known as key or candidate key of R if it satisfies both the following two conditions. The first condition is that closure of K, K+ contains all the attributes of relation R (K+ = A1,A2,A3,………..,An) and second condition is that for all subsets Y of K, (Y ⊆ K) , Y+ never contains all the attributes of relation R i.e. Y+ # A1,A2,A3,… ,An).

• If only one subset of R satisfy the above conditions, then it is known as Primary Key.
• If more than one subset of R satisfies the above conditions, then these subsets are known as Candidate Keys. In that case one of the candidate key is considered as Primary Key.
• A superset of candidate Key K is known as Super Key.

Note. A set of attributes of relation R that does not appear on right hand side of any FD in F is  a  Candidate  Key.

Prime and Non-Prime Attributes: For a given relation R= {A1,A2,A3,………..,An}, an attribute A is a prime attribute if A is a part of any candidate key of R otherwise A is a non-prime attribute.

Example. Consider a relation R(A, B, C, D, E, F, G, H) having a set of FD’s F = {D→AB,B→A, C→A, F→G, H→FGD, E→A}. What are the candidate keys of relation R? Also find prime and non-prime attributes.

Sol. Consider a subset K={C, E, H} of R. CEH is a candidate key as all attributes does not appear on right hand side of any  FD in F. It can also be  proved as follows.

Compute K+ which gives ABCDEFH. Now compute closure of every subset of K such as CE, EH and CH.

First consider Y= {C,E}, the closure Y+ = AC ≠ R.

Second consider Y={E,H}, the closure Y+ = ABDEFGH ≠ R.

Finally, consider Y= { C,H}, the closure Y+ = ABCDFGH ≠ R.

Thus, CEH is  a candidate key.

Prime Attributes are = CEH

Non-Prime Attributes are = ABDFG

Example. Consider a relation R(A,B,C,D,E) having a set of FD’s F= { A→BC, CD→E,B→D,E→A}. List all the candidate keys of relation R. Also find primary key, prime attributes and non-prime attributes.

Sol. There are 5 attributes, which gives a large number of subsets of R. To reduce the number of subsets to be considered, we should start with single attributes, then take subsets mentioned on LHS of all FD’s. If they are candidate keys, then ignore all subsets having these attributes (2nd condition of candidate key).

First of all try all individual attributes

Try A, then A+ = ABCDE = R, So A is a candidate key.

Try B, then B+ = BD ≠ R, So B is not a candidate key.

Try C, then C+ = C ≠ R, So C is not a candidate key.

Try D, then D+ = D ≠ R, So D is not a candidate key.

Try E, then E+ = ABCDE = R, So E is a candidate key.

Now try CD, then CD+ = ABCDE = R, so CD is a candidate key.

You can ignore all the subsets having A, E and CD. Thus only two subsets are left i.e.

BC and BD

Try BC, then BC+= ABCDE =R, so BC is a candidate key. Finaly, try BD, then BD+ =BD # R, So B is not a candidate key.

So, candidate  keys  of  R  are A, E,  BC  and  CD.

Consider one of them as primary keys. Take BC as a primary key, so Prime Attributes are B  and  C  and  non-prime attributes  are A,D  and  E.

Example. Consider the relation schema R = (A, B, C, D, E) and the set of functional dependencies:

AD → BE, CD → E, B → AE, AE → C, C → D

• List all of the candidate keys for
• The attribute set ABCDE is not a candidate key of Why not?

Sol.

• There are four candidate keys for R:  B, AC, AD,  and

(Hint: to generate the candidate keys, calculate closures of single attributes, then groups of two that do not contain a key, then groups of three, etc.)

• Lots of reasons, for example since B is a key, anything containing B cannot be a candidate key.

Example. Suppose you are given a relation R(A, B, C, D). For each of the following sets of FDs, assuming they are the only dependencies that hold for R, identify the candidate key(s) for R.

1. B→C, D→A.
3. A→B, B→C, C→D

Sol. 1.  Candidate key: BD

1.  Candidate keys: A and C
2. Candidate key: A

Example. Consider the following relational schema R(A, B, C, D, E) and set of functional dependencies AB → E and D → C.

List all superkey(s) for this relation. Which of these superkeys form a key (i.e., a minimal superkey) for this relation? Justify the answer in terms of functional dependencies and closures.

Sol. A superkey is a set of attributes X s.t. X+ = all attributes. From the  FDs  above,  we  can  derive:

{A, B, D}+ = {A, B, C, D}+ = {A, B, D, E}+ = {A, B, C, D, E}+ = {A, B, C, D, E}

Hence, {A,B,D},  {A,B,C,D}, {A,B,D,E},  and {A,B,C,D,E}  are all  superkeys.

A key is a set of attributes which form a superkey and for which no subset is a superkey. In this example, {A,B,D} is the only key.

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)