# Relational algebra

Relational algebra is a procedural query language. It uses a collection of operators to compose the queries. Every operator in the algebra accepts either one or two relation instances as arguments and output a resultant relation instance. Thus operators can be composed easily to construct complex queries. Each relational algebra query describes a step-by-step procedure for computing the desired answer, based on the order in which the operators are applied in the  query.

The relational algebra uses various logical connectives [∧(and), v(or), ¬(not)] and comparison operators (<, <=, =, #, >=, >) to construct composite and more complex queries.

We discuss the different operators (Basic set oriented operators—union, intersection, difference, and cartesian product and relational operators—selection, projection, division and join) in detail, with examples, in subsequent sections. All the examples are based on the EMPLOYEE-STUDENT database shown in Figure 5.1. This database contain two Tables EMPLOYEE and STUDENT and the relationship is that an employee can also a student and vice versa.

### 1. Operations in Relational Algebra

The relational algebraic operations can be divided into basic set-oriented operations (union, intersection, set difference, and Cartesian product) and relational-oriented operations (selection, projection, division  and  joins).

#### 1.1. Basic Set-oriented Operations

• The union operation : The union operation is a binary operation that is used to find union of Here relations are considered as sets. So, duplicate values are eliminated. It  is  denoted  by  (∪).

Conditions for union operation : There are two necessary conditions for union operation.

1. Both the relations have same  number of
2. Data types of their  corresponding attributes  must be

Two relations are said to be union compatible if they follow the above two conditions. Ex. If you want to find the names of all employees and names of all students together then the query is

The result  is  shown in  Figure  5.2.

• Set intersection operation : Set intersection is used to find common tuples between two It is denoted by (∩). If you want to find all the employees from Relation Employee those are also students. Rules of set union operations are also applicable here. Then the query,  is

The result  is  shown in  Figure  5.3.

• Set-difference operation : Set-difference operation is a binary operation which is used to find tuples that are present in one relation but not in other It is denoted by (—). It removes the common tuples of two relations and produce a new relation having rest of  the tupels of  first  relation.

Ex. If you want the names of those employees that are not students, then the query, is

The result  is  shown in  Figure  5.4.

• Cartesian product operation : Cartesian product is a binary operation which is used to combine information of any two Suppose a relation R1 is having m tuples and other relation R2 is having n tuples then R1 × R2 has m × n tuples. It is denoted by (X). Consider the Figure 5.5. Cartesian product of relation Employee and Job is  shown  in  Figure  5.6.

Query is → Employee × Job

#### 1.2.  Relational Oriented Operation

• Selection or Restriction operation : The selection operation is a unary This is used to find horizontal subset of relation or tuples of relation. It is denoted by sigma (σ).

Ex. If you want all the employees having salary more than 9,000 from relation Employee. The  query, is

The result is shown in Figure 5.7.

We can also combine these operations. If you want name of all employees having salary less than 7,000. Then the query is

Step 1. First we apply projection operation on relation employee to get name of all employees.

Step 2. Then we apply selection operation to choose employees having salary less than 7,000.

The result is shown in Figure 5.8.

• Projection operation : The projection operation is a unary operation which applies only on a single relation at a Project operation is used to select vertical subset of relation (i.e.,  columns  of  table).  It  is  denoted by  pi  (π).

Consider Figure 5.1. If you want all the names of employees and their salary from relation employee. Then  query, is

The result is shown in Figure 5.9.

• Division operation : Division operation is useful in special kind of queries that include the phrase “for all”. It is denoted by (÷). It is like the inverse of Cartesian product.

For example :

Let R be the relation with schema r and S be the relation with schema s. Let S ⊆ R. Then tuple t is in r ÷ s if and only if

• t is in
• If tuple is present in the Cartesian  product of S  and R.
• Natural-join operation : Natural join is used to join two relations having any number of It is denoted by symbol (). It also optimize the query as Cartesian product gives unnecessary results and set-union and set-intersection operations are applicable only on those relations that have equal number of attributes with same data-type. Consider the relations in Figure 5.11.

Ex. Find the names of all employees from relation employee with their respective department names. The query is

The result is shown in Figure 5.12.

• Outer join : Outer Join is an extension of natural join It deals with the missing information caused by natural join operation.

Suppose you need all information of all employees and all students in a single relation.

Natural join () : The natural join (Employee Student) gives the result as shown in Figure  5.13.

In this result, information about Ramesh, Jack, Vijay, Gaurav are missing. So, outer join is used to avoid this loss of information.

There are Three types of outer joins.

• Left outer join : It is used to take all tuples of relation that are on the left side whether they are matching with tuples of right side relation or It is denoted by ( ).

Employee relation is at left side so table in Figure 5.14 consists all information of Employee relation but still missing information about Vijay and Gaurav.

• Right outer join : It is used to take all tuples of relation that are on the right side whether they are matching with tuples of left side relation or It is denoted by ( ).

Student relation is at right side so table in Figure 5.15 consists all information of Student  relation  but  still missing  information  about  Ramesh and  Jack.

• Full outer join : It is used to take all tuples from left and right relation whether they match with  each  other  or  did  not    It  is  denoted  by  (  ).

(Employee    Student)    gives

Table in Figure 5.16 consist all information of Employee and Student relation. Here, no  information  is  missing.

Points to Remember

1. Perform projection and restriction as soon as
2. Convert two or more  operations into  single  operation if
3. While joining two tables put small table on
4. At run time, you will see a blank space instead of For better understanding use “NULL”.

Example. Given two relations R1 and R2, where R1 contains N1 tuples, R2 contains N2 tuples and N2 > N1 > 0. Determine the maximum and minimum number of tuples produced by each of the following relational algebra expressions. Also state any assumption about R1 and R2  for  making  the  expression  meaningful.

Sol. The following table gives the assumptions and corresponding minimum and maximum tuples produced  by  each relational  algebra  expression.

#### 1.3.  Extended Relational-Algebra Operations

In addition to the fundamental operations, there are some additional operations to make relational algebra  more  flexible.

Extended Relational-Algebra-Operations

• Generalized Projection: Extends the projection operation by allowing arithmetic functions to be  used  in  the  projection

E is any relational-algebra expression

Each of F1, F2, …, Fn are arithmetic expressions involving constants and attributes in the schema of E.

•  Aggregate Functions and Operations

Aggregate or set functions are introduced to relational algebra to increase its expressive power. An aggregate function operates on a set of values (tuples) and computes one single value as output. These functions take a collection of values and return a single value as a result. The aggregate operation  in relational algebra is defined as

E is any relational-algebra expression

G1, G2, …, .Gn is a list of attributes on which to group (can be empty)

Each Fi is  an  aggregate  function

Each Ai is an attribute name

𝐹 denotes the character script-F

The various aggregate functions are

• Avg (average value) : computes the average of all values in the (numeric) set
• Min(minimum value): finds  the minimum  value  of all  values  in the  set
• Max(maximum value): finds  the maximum  value  of all  values  in the  set
• Sum (sum of values): computes the sum of all values in the (numeric) set
• Count (number of values): returns the cardinality (number of elements) in the set

The result of aggregation does not have a name. You can use rename operation to give it a name.

Example. Consider  the relation  as shown  below:

• Rename operation : The rename operation is a unary operation which is used to give names to relational algebra It is denoted by rho (ρ). Suppose, you want to find Cartesian product of a relation with itself then by using rename operator we give an alias name to that relation. Now, you can easily multiply that relation with its alias. It is helpful in removing ambiguity. Its general form is

where R be any relation and  x be the alias name given  to relation R.

Ex. Find the  highest salary  in Employee  relation.

To do this, first make a temporary relation that contains all salaries except highest one. Then take the difference of temporary relation with employee relation.

Query is

Step 1. Take alias name of relation Employee by using rename operator

Here alias name is a

Step 2. Take Cartesian product of relation Employee with its alias. This gives all combinations.

Step 3. Now, choose the salary in a way that temporary relation contains all salaries except highest one.

Step 4. Take set-difference of temporary relation with relation employee.

The result is shown in Figure 5.17.

The second form of rename operator : You can also rename the attributes of any relation.

where A1, A2, …, An are new names for attributes of relation E.

Consider the  relation  student in  Figure  5.1.

The result is shown in Fig. 5.18.

• Assignment operation : Assignment operation is used to assign temporary names to relational algebra It is useful in large expressions. It is denoted by (←).

Consider an expression.

Now, you can write above expression as

Source: Gupta Satinder Bal, Mittal Aditya (2017), Introduction to Basic Database Management System, 2nd Edition-University Science Press (2017)