Business Intelligence and Transact-SQL: OLAP Query Functions

Transact-SQL supports two groups of functions that are categorized as OLAP query functions:

  • Ranking functions
  • Statistical aggregate functions

The following subsections describe these functions.

NOTE The GROUPING function, discussed previously, also belongs to the OLAP functions.

1. Ranking Functions

Ranking functions return a ranking value for each row in a partition group. Transact-SQL supports the following ranking functions:

  • RANK
  • DENSE_RANK
  • ROW_NUMBER

Example 24.15 shows the use of the RANK function.

Example 24.15

Find all departments with a budget not greater than 30000, and display the result set in descending order:

USE sample;

SELECT RANK() OVER(ORDER BY budget DESC) AS rank_budget,

dept_name, emp_cnt, budget

FROM project_dept

WHERE budget <= 30000;

The result is

Example 24.15 uses the RANK function to return a number (in the first column of the result set) that specifies the rank of the row among all rows. The example uses the OVER clause to sort the result set by the budget column in descending order. (In this example, the PARTITION BY clause is omitted. For this reason, the whole result set will belong to only one partition.)

NOTE The RANK function uses logical aggregation. In other words, if two or more rows in a result set are tied (have a same value in the ordering column), they will have the same rank. The row with the subsequent ordering will have a rank that is one plus the number of ranks that precede the row. For this reason, the RANK function displays “gaps” if two or more rows have the same ranking.

Example 24.16 shows the use of the two other ranking functions, DENSE_RANK and ROW_NUMBER.

Example 24.16

Find all departments with a budget not greater than 40000, and display the dense rank and the sequential number of each row in the result set:

USE sample;

SELECT DENSE_RANK() OVER( ORDER BY budget DESC ) AS dense_rank,

ROW_NUMBER() OVER( ORDER BY budget DESC ) AS row_number,

dept_name, emp_cnt, budget

FROM project_dept

WHERE budget <= 40000;

The result is

The first two columns in the result set of Example 24.16 show the values for the DENSE_ RANK and ROW_NUMBER functions, respectively. The output of the DENSE_RANK function is similar to the output of the RANK function (see Example 24.15). The only difference is that the DENSE_RANK function returns no “gaps” if two or more ranking values are equal and thus belong to the same ranking.

The use of the ROW_NUMBER function is obvious: it returns the sequential number of a row within a result set, starting at 1 for the first row.

In the previous two examples, the OVER clause is used to determine the ordering of the result set. As you already know, this clause can also be used to divide the result set produced by the FROM clause into groups (partitions), and then to apply an aggregate or ranking function to each partition separately.

Example 24.17 shows how the RANK function can be applied to partitions.

Example 24.17

Using the window construct, partition the rows of the project_dept table according to the values in the date_month column. Sort the rows in each partition and display them in ascending order.

USE sample;

SELECT date_month, dept_name, emp_cnt, budget,

RANK() OVER( PARTITION BY date_month ORDER BY emp_cnt desc ) AS rank

FROM project_dept;

The result is

The result set of Example 24.17 is divided (partitioned) into eight groups according to the values in the date_month column. After that the RANK function is applied to each partition.

2. Statistical Aggregate Functions

Chapter 6 introduced statistical aggregate functions. There are four of them:

  • VAR Computes the variance of all the values listed in a column or expression.
  • VARP Computes the variance for the population of all the values listed in a column or expression.
  • STDEV Computes the standard deviation of all the values listed in a column or expression. (The standard deviation is computed as the square root of the corresponding variance.)
  • STDEVP Computes the standard deviation for the population of all the values listed in a column or expression.

You can use statistical aggregate functions with or without the window construct. Example 24.18 shows how the functions VAR and STDEV can be used with the window construct.

Example 24.18

Using the window construct, calculate the variance and standard deviation of budgets in relation to partitions formed using the values of the dept_name column:

USE sample;

SELECT dept_name, budget,

VAR(budget) OVER(PARTITION BY dept_name) AS budget_var,

STDEV(budget) OVER(PARTITION BY dept_name) AS budget_stdev

FROM project_dept

WHERE dept_name in (‘Accounting’, ‘Research’);

Example 24.18 uses the statistical aggregate functions VAR and STDEV to calculate the variance and standard deviation of budgets in relation to partitions formed using the values of the dept_name column.

Source: Petkovic Dusan (2020), Microsoft SQL Server 2019: A Beginner’s Guide, Seventh Edition-McGraw-Hill Education.

Leave a Reply

Your email address will not be published. Required fields are marked *