Heuristic query optimization

Heuristic query optimization technique is used to modify the internal representation of a query by using heuristic rules and transformation rules. Heuristic rules are used in the form of a query tree or query graph structure. Optimiser starts with initial query tree and transform it  into an  equivalent and  efficient  query tree  using transformation  rules.

Heuristic Optimization Algorithm : DBMS use heuristic optimization algorithms to improve the efficiency of query by converting initial query tree into an equivalent and optimized query tree. Optimizers utilize transformation rules to optimize the structure of query tree. Following are  the  steps  of  heuristic  optimization  algorithm.

  • Step 1. Perform Selection operation as early as possible : By using selection operation at early stages, you can reduce the unwanted number of record or data, to transfer from database to  primary memory.

Optimizer use transformation rule 1 to divide selection operations with conjunctive conditions into a cascade of selection operations.

  • Step 2. Perform commutativity of selection operation with other operations   as   early   as possible : Optimizer use transformation rule 2, 4, 6, and 9 to move selection operation as far down the tree as possible and keep selection predicates on the same relation together. By keeping selection operation down at tree reduces the unwanted data transfer and by keeping selection predicates together on same relations reduces the number of times of database manipulation to retrieve records from  same  database table.
  • Step 3. Combine the Cartesian Product with subsequent selection operation whose predicates represents a join condition into a JOIN operation : Optimizer uses transformation rule 13 to convert a selection and cartesian product sequence into join. It reduces data transfer. It is always better to transfer only required data from database instead of transferring whole data and then refine it. (Cartesian product combines all data of all the tables mention in query while join operation retrieves only those records  from database  that satisfy  the join  condition).
  • Step 4. Use Commutativity and Associativity of Binary operations : Optimizer use transformation rules 5, 11, and 12 to execute the most restrictive selection operations first.

It rearranges the leaf nodes of query tree. By using the most restrictive selection operations, the number of records fetched from database reduces and also subsequent operations  can be  performed on  less  number of  records.

  • Step 5. Perform projection operations as early as possible : After performing selection operations, optimizer use transformation rules 3, 4, 7 and 10 to reduce the number of columns of a relation by moving projection operations as far down the tree as possible and keeping projection predicates on the same relation together.
  • Step 6. Compute common expressions only once: It is used to identify sub-trees that represent groups of  operations  that  can  be executed  by  a  single  algorithm.

Consider the query below in SQL and transformation of its initial query tree into an optimal query  tree.

SELECT Emp_Name

FROM Employee e, Department d, Project p

WHERE p.Name = ‘LUXMI PUB.’

AND d.Proj_ID = p.Proj_ID

AND e.Dept_ID= d.Dept_ID

AND e.Age > 35

This query needs to display names of all employees working for project “LUXMI PUB.” and having  age  more  than  35  years.

Figure 11.10 shows the initial query tree for the given SQL query. If the tree is executed directly then it results in the Cartesian product of entire Employee, Department, and Project table but in reality, the query needed only one record from relation Project and only the employee records  for  those whose  age  is greater  than  35 years.

  • We can improve the performance by first applying selection operations to reduce the number of records that appear in Casterian Figure 11.11 shows the improved query tree.

  • The query tree can be further improved by applying more restrictive selection operation. So, switch the positions of relations Project and Employee as you know that in a single project it may be more than one Figure 11.12 shows the improved query tree.

  • A further improvement can be done by replacing Cartesian product operations by Join operations with a  join condition  as  shown in  Figure 13.

  • Further improvement can be done in query tree by keeping only required attributes (columns) of relations by applying projection operations as early as possible in the query Optimizer keep the attributes required to   display,   and   the   attributes   needed   by the subsequent operation in the intermediate relations. Modified query tree is shown in Figure 11.14.

Note: The SELECT clause in SQL is equivalent to projection operation and WHERE clause in SQL is equivalent to  selection  operation  in  relational  algebra.

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)

Leave a Reply

Your email address will not be published. Required fields are marked *