Relational algebra is a procedural query language. It uses a collection of operators to compose the queries. Every operator in the algebra accepts either one or two relation instances as arguments and output a resultant relation instance. Thus operators can be composed easily to construct complex queries. Each relational algebra query describes a step-by-step procedure for computing the desired answer, based on the order in which the operators are applied in the query.
The relational algebra uses various logical connectives [∧(and), v(or), ¬(not)] and comparison operators (<, <=, =, #, >=, >) to construct composite and more complex queries.
We discuss the different operators (Basic set oriented operators—union, intersection, difference, and cartesian product and relational operators—selection, projection, division and join) in detail, with examples, in subsequent sections. All the examples are based on the EMPLOYEE-STUDENT database shown in Figure 5.1. This database contain two Tables EMPLOYEE and STUDENT and the relationship is that an employee can also a student and vice versa.
1. Operations in Relational Algebra
The relational algebraic operations can be divided into basic set-oriented operations (union, intersection, set difference, and Cartesian product) and relational-oriented operations (selection, projection, division and joins).
1.1. Basic Set-oriented Operations
- The union operation : The union operation is a binary operation that is used to find union of Here relations are considered as sets. So, duplicate values are eliminated. It is denoted by (∪).
Conditions for union operation : There are two necessary conditions for union operation.
-
- Both the relations have same number of
- Data types of their corresponding attributes must be
Two relations are said to be union compatible if they follow the above two conditions. Ex. If you want to find the names of all employees and names of all students together then the query is
The result is shown in Figure 5.2.
- Set intersection operation : Set intersection is used to find common tuples between two It is denoted by (∩). If you want to find all the employees from Relation Employee those are also students. Rules of set union operations are also applicable here. Then the query, is
The result is shown in Figure 5.3.
- Set-difference operation : Set-difference operation is a binary operation which is used to find tuples that are present in one relation but not in other It is denoted by (—). It removes the common tuples of two relations and produce a new relation having rest of the tupels of first relation.
Ex. If you want the names of those employees that are not students, then the query, is
The result is shown in Figure 5.4.
- Cartesian product operation : Cartesian product is a binary operation which is used to combine information of any two Suppose a relation R1 is having m tuples and other relation R2 is having n tuples then R1 × R2 has m × n tuples. It is denoted by (X). Consider the Figure 5.5. Cartesian product of relation Employee and Job is shown in Figure 5.6.
Query is → Employee × Job
1.2. Relational Oriented Operation
- Selection or Restriction operation : The selection operation is a unary This is used to find horizontal subset of relation or tuples of relation. It is denoted by sigma (σ).
Ex. If you want all the employees having salary more than 9,000 from relation Employee. The query, is
The result is shown in Figure 5.7.
We can also combine these operations. If you want name of all employees having salary less than 7,000. Then the query is
Step 1. First we apply projection operation on relation employee to get name of all employees.
Step 2. Then we apply selection operation to choose employees having salary less than 7,000.
The result is shown in Figure 5.8.
- Projection operation : The projection operation is a unary operation which applies only on a single relation at a Project operation is used to select vertical subset of relation (i.e., columns of table). It is denoted by pi (π).
Consider Figure 5.1. If you want all the names of employees and their salary from relation employee. Then query, is
The result is shown in Figure 5.9.
- Division operation : Division operation is useful in special kind of queries that include the phrase “for all”. It is denoted by (÷). It is like the inverse of Cartesian product.
For example :
Let R be the relation with schema r and S be the relation with schema s. Let S ⊆ R. Then tuple t is in r ÷ s if and only if
- t is in
- If tuple is present in the Cartesian product of S and R.
- Natural-join operation : Natural join is used to join two relations having any number of It is denoted by symbol (). It also optimize the query as Cartesian product gives unnecessary results and set-union and set-intersection operations are applicable only on those relations that have equal number of attributes with same data-type. Consider the relations in Figure 5.11.
Ex. Find the names of all employees from relation employee with their respective department names. The query is
The result is shown in Figure 5.12.
- Outer join : Outer Join is an extension of natural join It deals with the missing information caused by natural join operation.
Suppose you need all information of all employees and all students in a single relation.
Natural join () : The natural join (Employee Student) gives the result as shown in Figure 5.13.
In this result, information about Ramesh, Jack, Vijay, Gaurav are missing. So, outer join is used to avoid this loss of information.
There are Three types of outer joins.
- Left outer join : It is used to take all tuples of relation that are on the left side whether they are matching with tuples of right side relation or It is denoted by ( ).
Employee relation is at left side so table in Figure 5.14 consists all information of Employee relation but still missing information about Vijay and Gaurav.
- Right outer join : It is used to take all tuples of relation that are on the right side whether they are matching with tuples of left side relation or It is denoted by ( ).
Student relation is at right side so table in Figure 5.15 consists all information of Student relation but still missing information about Ramesh and Jack.
- Full outer join : It is used to take all tuples from left and right relation whether they match with each other or did not It is denoted by ( ).
(Employee Student) gives
Table in Figure 5.16 consist all information of Employee and Student relation. Here, no information is missing.
Points to Remember
- Perform projection and restriction as soon as
- Convert two or more operations into single operation if
- While joining two tables put small table on
- At run time, you will see a blank space instead of For better understanding use “NULL”.
Example. Given two relations R1 and R2, where R1 contains N1 tuples, R2 contains N2 tuples and N2 > N1 > 0. Determine the maximum and minimum number of tuples produced by each of the following relational algebra expressions. Also state any assumption about R1 and R2 for making the expression meaningful.
Sol. The following table gives the assumptions and corresponding minimum and maximum tuples produced by each relational algebra expression.
1.3. Extended Relational-Algebra Operations
In addition to the fundamental operations, there are some additional operations to make relational algebra more flexible.
Extended Relational-Algebra-Operations
- Generalized Projection: Extends the projection operation by allowing arithmetic functions to be used in the projection
E is any relational-algebra expression
Each of F1, F2, …, Fn are arithmetic expressions involving constants and attributes in the schema of E.
- Aggregate Functions and Operations
Aggregate or set functions are introduced to relational algebra to increase its expressive power. An aggregate function operates on a set of values (tuples) and computes one single value as output. These functions take a collection of values and return a single value as a result. The aggregate operation in relational algebra is defined as
E is any relational-algebra expression
G1, G2, …, .Gn is a list of attributes on which to group (can be empty)
Each Fi is an aggregate function
Each Ai is an attribute name
𝐹 denotes the character script-F
The various aggregate functions are
- Avg (average value) : computes the average of all values in the (numeric) set
- Min(minimum value): finds the minimum value of all values in the set
- Max(maximum value): finds the maximum value of all values in the set
- Sum (sum of values): computes the sum of all values in the (numeric) set
- Count (number of values): returns the cardinality (number of elements) in the set
The result of aggregation does not have a name. You can use rename operation to give it a name.
Example. Consider the relation r as shown below:
- Rename operation : The rename operation is a unary operation which is used to give names to relational algebra It is denoted by rho (ρ). Suppose, you want to find Cartesian product of a relation with itself then by using rename operator we give an alias name to that relation. Now, you can easily multiply that relation with its alias. It is helpful in removing ambiguity. Its general form is
where R be any relation and x be the alias name given to relation R.
Ex. Find the highest salary in Employee relation.
To do this, first make a temporary relation that contains all salaries except highest one. Then take the difference of temporary relation with employee relation.
Query is
Step 1. Take alias name of relation Employee by using rename operator
Here alias name is a
Step 2. Take Cartesian product of relation Employee with its alias. This gives all combinations.
Step 3. Now, choose the salary in a way that temporary relation contains all salaries except highest one.
Step 4. Take set-difference of temporary relation with relation employee.
The result is shown in Figure 5.17.
The second form of rename operator : You can also rename the attributes of any relation.
where A1, A2, …, An are new names for attributes of relation E.
Consider the relation student in Figure 5.1.
The result is shown in Fig. 5.18.
- Assignment operation : Assignment operation is used to assign temporary names to relational algebra It is useful in large expressions. It is denoted by (←).
Consider an expression.
Now, you can write above expression as
Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)