Informal Design Guidelines for Relation Schemas in Database

There are four informal measures of quality for relation schema design. These measures are not always independent of one another. These Four informal measures are:

  1. Meaning (semantics) of the relation
  2. Reducing the redundant (repetitive) values in
  3. Reducing the null values in
  4. Not allowing the possibility of generating spurious

(i) Meaning of the relation attributes : When the attributes are grouped to form a relation schema, it is assumed that attributes belonging to one relation have certain real-word meaning and a proper interpretation associated with This meaning (Semantics) specifies how the attribute values in a tuple relate to one another. The conceptual design should have a clear meaning, if it is done carefully, followed by a systematic mapping into relations and most of the semantics will have been accounted for. Thus the easier it is to explain the meaning of the relation, the better the relation schema  design  will  be.

GUIDELINE 1 : Design a relation schema so that it is easy to explain its meaning. The attributes from multiple entity types and relationship types should not be combined into a single relation. Thus a relation schema that corresponds to one entity type or one relationship  type  has  a straight  forward  meaning.

(ii) Redundant information in tuples and update anomalies : One major goal of schema design is to minimize the storage space needed by the base A significant effect on storage space occurred, when we group attributes into relation schemas. The second major problem, when we use relations as base relations is the problem of update anomalies. There are mainly three types of update anomalies in a relation i.e., Insertion Anomalies, deletion anomalies and modification Anomalies. These anomalies are discussed in  the  section  6.4  in  detail.

GUIDELINE 2 : Design the base relation schema in such a way that no updation anomalies (insertion, deletion and modification) are present in the relations. If present, note them and make sure that the programs that update the database will operate correctly.

(iii) Null values in tuples : When many attributes are grouped together into a “fat” relation and many of the attributes do not apply to all tuples in the relation, then there exists many NULL’S in those This wastes a lot of space. It is also not possible to understand the meaning of the attributes having NULL Values. Another problem occur when specifying the join operation. One major and most important problem with Null’s is how to account for them when aggregate operation (i.e., COUNT or SUM) are applied.  The Null’s can  have multiple  interpretations, like:

    1. The attribute does  not apply  to  this rule.
    2. The attribute is unknown for this tuple.
    3. The value is known but not present i.e., cannot be recorded.

GUIDELINE 3 : Try to avoid, placing the attributes in a base relation whose value may usually be NULL. If Null’s are unavoidable, make sure that apply in exceptional cases only and majority of the tuples  must have some not NULL Value.

(iv) Generation of spurious tuples : The Decomposition of a relation schema R into two relations R1 and R2 is undesirable, because if we join them back using NATURAL Join, we do not get the correct original The join operation generates spurious tuples that  represent  the invalid  information.

GUIDELINE 4 : The relation Schemas are designed in such a way that they can be joined with equality conditions on attributes that are either primary key or foreign key. This guarantees that no spurious tuples will be generated. Matching attributes in relations that are not (foreign key, primary key) combinations must be avoided, because joining  on  such  attributes  may  produce spurious  tuples.

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)

Leave a Reply

Your email address will not be published. Required fields are marked *