SQL Subqueries in the HAVING Clause *

Although subqueries are most often found in the WHERE clause, they can also be used in the HAVING clause of a query. When a subquery appears in the HAVING clause, it works as part of the row group selection performed by the HAVING clause. Consider this query with a subquery:

List the salespeople whose average order size for products manufactured by ACI is higher than overall average order size.

SELECT NAME, AVG(AMOUNT)

FROM SALESREPS, ORDERS

WHERE EMPL_NUM = REP

AND MFR = ‘ACI’

GROUP BY NAME

HAVING AVG(AMOUNT) > (SELECT AVG(AMOUNT)

FROM ORDERS)

NAME        AVG(AMOUNT)

———– ————

Sue Smith   $15,000.00

Tom Snyder  $22,500.00

Figure 9-7 shows conceptually how this query works. The subquery calculates the overall average order size. It is a simple subquery and contains no outer references, so SQL can calculate the average once and then use it repeatedly in the HAVING clause. The main query goes through the ORDERS table, finding all orders for ACI products, and groups them by salesperson. The HAVING clause then checks each row group to see whether the average order size in that group is bigger than the average for all orders, calculated earlier. If so, the row group is retained; if not, the row group is discarded. Finally, the SELECT clause produces one summary row for each group, showing the name of the salesperson and the average order size for each.

You can also use a correlated subquery in the HAVING clause. Because the subquery is evaluated once for each row group, however, all outer references in the correlated subquery must be single-valued for each row group. Effectively, this means that the outer reference must either be a reference to a grouping column of the outer query or be contained within a column function. In the latter case, the value of the column function for the row group being tested is calculated as part of the subquery processing.

If the previous request is changed slightly, the subquery in the HAVING clause becomes a correlated subquery:

List the salespeople whose average order size for products manufactured by ACI is at least as big as that salesperson’s overall average order size.

 SELECT NAME, AVG(AMOUNT)

FROM SALESREPS, ORDERS

WHERE EMPL_NUM = REP AND MFR = ‘ACI’

GROUP BY NAME, EMPL_NUM HAVING AVG(AMOUNT) >= (SELECT AVG(AMOUNT)

FROM ORDERS

WHERE REP = EMPL_NUM)

NAME        AVG(AMOUNT)

———– ————

Bill Adams   $7,865.40

Sue Smith   $15,000.00

Tom Snyder  $22,500.00

In this new example, the subquery must produce the overall average order size for the salesperson whose row group is currently being tested by the HAVING clause. The subquery selects orders for that particular salesperson, using the outer reference EMPL_NUM. The outer reference is legal because EMPL_NUM has the same value in all rows of a group produced by the main query.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *