Simple Queries in SQL: Search Conditions

SQL offers a rich set of search conditions that allow you to specify many different kinds of queries efficiently and naturally. Five basic search conditions (called predicates in the ANSI/ISO standard) are summarized here and are described in the sections that follow:

  • Comparison test. Compares the value of one expression to the value of another expression. Use this test to select offices in the Eastern region, or salespeople whose sales are above their quotas.
  • Range test. Tests whether the value of an expression falls within a specified range of values. Use this test to find salespeople whose sales are between $100,000 and $500,000.
  • Set membership test. Checks whether the value of an expression matches one of a set of values. Use this test to select offices located in New York, Chicago, or Los Angeles.
  • Pattern matching test. Checks whether the value of a column containing string data matches a specified pattern. Use this test to select customers whose names start with the letter
  • Null value test. Checks whether a column has a NULL (unknown) value. Use this test to find the salespeople who have not yet been assigned to a manager.

1. The Comparison Test (=, <>, <, <=, >, >=)

The most common search condition used in a SQL query is a comparison test. In a comparison test, SQL computes and compares the values of two SQL expressions for each row of data. The expressions can be as simple as a column name or a constant, or they can be more complex arithmetic expressions. SQL offers six different ways of comparing the two expressions, as shown in Figure 6-7.

Next are some examples of typical comparison tests.

Find salespeople hired before 1988

SELECT NAME

FROM SALESREPS

WHERE HIRE_DATE < ’01-JAN-88

NAME

————

Sue Smith

Bob Smith

Dan Roberts

Paul Cruz

List the offices whose sales fall below 80 percent of target.

SELECT CITY, SALES, TARGET FROM OFFICES

WHERE SALES < (.8 * TARGET)

CITY      SALES           TARGET

———————————–

Denver   $186,042.00    $300,000.00

List the offices not managed by employee number 108.

SELECT CITY, MGR

FROM OFFICES

WHERE MGR <> 108

CITY        MGR 

——————-

New York    106

Chicago     104

Atlanta     105

As shown in Figure 6-7, the inequality comparison test is written as “A < > B” according to the ANSI/ISO SQL specification. Several SQL implementations use alternate notations, such as “A != B” (used by SQL Server) and “A0=B” (used by DB2 and SQL/DS). In some cases, these are alternative forms; in others, they are the only acceptable form of the inequality test.

When SQL compares the values of the two expressions in the comparison test, three results can occur:

  • If the comparison is true, the test yields a TRUE
  • If the comparison is false, the test yields a FALSE
  • If either of the two expressions produces a NULL value, the comparison yields a NULL

2. Single-Row Retrieval

The most common comparison test is one that checks whether a column’s value is equal to some constant. When the column is a primary key, the test isolates a single row of the table, producing a single row of query results, as in this example:

Retrieve the name and credit limit of customer number 2107.

SELECT COMPANY, CREDIT_LIMIT FROM CUSTOMERS WHERE CUST_NUM = 2107

COMPANY                 CREDIT LIMIT

————————————

Ace International        $35,000.00

This type of query is the foundation of forms-based database retrieval programs. The user enters a customer number into the form, and the program uses the number to construct and execute a query. It then displays the retrieved data in the form.

Note that the SQL statements for retrieving a specific customer by number, as in this example, and retrieving all customers with a certain characteristic (such as those with credit limits over $25,000) both have exactly the same form. These two types of queries (retrieval by primary key and retrieval based on a search of the data) would be very different operations in a nonrelational database. This uniformity of approach makes SQL much simpler to learn and use than earlier query languages.

3. Null Value Considerations

The behavior of NULL values in comparison tests can reveal some “obviously true” notions about SQL queries to be, in fact, not necessarily true. For example, it would seem that the results of these two queries would include every row of the SALESREPS table:

List salespeople who are over quota.

SELECT NAME

FROM SALESREPS

WHERE SALES > QUOTA

NAME

————

Bill Adams

Mary Jones

Sue Smith

Sam Clark

Dan Roberts

Larry Fitch

Paul Cruz

List salespeople who are under or at quota.

SELECT NAME

FROM SALESREPS

WHERE SALES < = QUOTA

NAME

———-

Bob Smith

Nancy Angelli

However, the queries produce seven and two rows, respectively, for a total of nine rows, while there are ten rows in the SALESREPS table. Tom Snyder’s row has a NULL value in the QUOTA column because he has not yet been assigned a quota. This row is not listed by either query; it “vanishes” in the comparison test.

As this example shows, you need to think about NULL value handling when you specify a search condition. In SQL’s three-valued logic, a search condition can yield a TRUE, FALSE, or NULL result. Only rows where the search condition yields a TRUE result are included in the query results.

4. The Range Test (between)

SQL provides a different form of search condition with the range test (BETWEEN) shown in Figure 6-8. The range test checks whether a data value lies between two specified values. It involves three SQL expressions. The first expression defines the value to be tested; the second and third expressions define the low and high ends of the range to be checked. The data types of the three expressions must be comparable. This example shows a typical range test:

Find orders placed in the last quarter of 1989.

The BETWEEN test includes the endpoints of the range, so orders placed on October 1 or December 31 are included in the query results. Here is another example of a range test:

Find the orders that fall into various amount ranges.

SELECT ORDER_NUM, AMOUNT

FROM ORDERS

WHERE AMOUNT BETWEEN 20000.00 AND 29999.99

ORDER NUM   AMOUNT

——————-

113036    $22,500.00

112987    $27,500.00

113042    $22,500.00

 

SELECT ORDER_NUM, AMOUNT

FROM ORDERS

WHERE AMOUNT BETWEEN 30000.00 AND 39999.99

ORDER NUM   AMOUNT

——————-

112961    $31,500.00

113069    $31,350.00

 

SELECT ORDER_NUM, AMOUNT

FROM ORDERS

WHERE AMOUNT BETWEEN 40000.00 AND 49999.99

ORDER NUM   AMOUNT

——————-

113045    $45,000.00

The negated version of the range test (NOT BETWEEN) checks for values that fall outside the range, as in this example:

List salespeople whose sales are not between 80 percent and 120 percent of quota.

SELECT NAME, SALES, QUOTA

FROM SALESREPS

WHERE SALES NOT BETWEEN (.8 * QUOTA) AND (1.2 * QUOTA)

The test expression specified in the BETWEEN test can be any valid SQL expression, but in practice, it’s usually just a column name, as in the previous examples.

The ANSI/ISO standard defines relatively complex rules for the handling of NULL values in the BETWEEN test:

  • If the test expression produces a NULL value, or if both expressions defining the range produce NULL values, then the BETWEEN test returns a NULL
  • If the expression defining the lower end of the range produces a NULL value, then the BETWEEN test returns FALSE if the test value is greater than the upper bound, and NULL
  • If the expression defining the upper end of the range produces a NULL value, then the BETWEEN test returns FALSE if the test value is less than the lower bound, and NULL

Before relying on this behavior, it’s a good idea to experiment with your DBMS.

It’s worth noting that the BETWEEN test doesn’t really add to the expressive power of SQL, because it can be expressed as two comparison tests. The range test:

A BETWEEN B AND C

is completely equivalent to:

(A >= B) AND (A < = C)

However, the BETWEEN test is a simpler way to express a search condition when you’re thinking of it in terms of a range of values.

5. The Set Membership Test (in)

Another common search condition is the set membership test (IN), shown in Figure 6-9. It tests whether a data value matches one of a list of target values. Here are several queries that use the set membership test:

List the salespeople who work in New York, Atlanta, or Denver.

Find all orders placed on a Thursday in January 1990.

Find all orders placed with four specific salespeople.

You can check whether the data value does not match any of the target values by using the NOT IN form of the set membership test. The test expression in an IN test can be any SQL expression, but it’s usually just a column name, as in the preceding examples. If the test expression produces a NULL value, the IN test returns NULL. All of the items in the list of target values must have the same data type, and that type must be comparable to the data type of the test expression.

Like the BETWEEN test, the IN test doesn’t add to the expressive power of SQL, because the search condition:

X IN (A, B, C)

is completely equivalent to:

(X = A) OR (X = B) OR (X = C)

However, the IN test offers a much more efficient way of expressing the search condition, especially if the set contains more than a few values. The ANSI/ISO SQL standard doesn’t specify a maximum limit to the number of items that can appear in the value list, and most commercial implementations do not state an explicit upper limit either. For portability reasons, it’s generally a good idea to avoid lists with only a single item, such as this one:

CITY IN (‘New York’)

and replace them with a simple comparison test:

CITY = ‘New York

6. The Pattern Matching Test (like)

You can use a simple comparison test to retrieve rows where the contents of a text column match some particular text. For example, this query retrieves a row of the CUSTOMERS table by name:

Show the credit limit for Smithson Corp.

SELECT COMPANY, CREDIT_LIMIT

FROM CUSTOMERS

WHERE COMPANY = ‘Smithson Corp.’

However, you might easily forget whether the company’s name was “Smith,” “Smithson,” or “Smithsonian.” You can use SQL’s pattern matching test to retrieve the data based on a partial match of the customer’s name.

The pattern matching test (LIKE), shown in Figure 6-10, checks to see whether the data value in a column matches a specified pattern. The pattern is a string that may include one or more wildcard characters. These characters are interpreted in a special way.

Wildcard Characters

The percent sign (%) wildcard character matches any sequence of zero or more characters. Here’s a modified version of the previous query that uses the percent sign for pattern matching:

 SELECT COMPANY, CREDIT_LIMIT

FROM CUSTOMERS

WHERE COMPANY LIKE ‘Smith% Corp.’

The LIKE keyword tells SQL to compare the NAME column to the pattern “Smith% Corp. ” Any of the following names would match the pattern:

Smith Corp. Smithson Corp. Smithsen Corp. Smithsonian Corp.

but these names would not:

SmithCorp Smithson Inc.

The underscore (_) wildcard character matches any single character. If you are sure that the company’s name is either “Smithson” or “Smithsen,” for example, you can use this query:

 SELECT COMPANY, CREDIT_LIMIT

FROM CUSTOMERS

WHERE COMPANY LIKE ‘Smiths_n Corp.’

In this case, any of these names will match the pattern:

Smithson Corp. Smithsen Corp. Smithsun Corp.

but these names will not:

Smithsoon Corp. Smithsn Corp.

Wildcard characters can appear anywhere in the pattern string, and several wildcard characters can be within a single string. This query allows either the “Smithson” or “Smithsen” spelling and will also accept “Corp.,” “Inc.,” or any other ending on the company name:

 SELECT COMPANY, CREDIT_LIMIT

FROM CUSTOMERS

WHERE COMPANY LIKE ‘Smiths_n %’

You can locate strings that do not match a pattern by using the NOT LIKE form of the pattern matching test. The LIKE test must be applied to a column with a string data type. If the data value in the column is NULL, the LIKE test returns a NULL result.

If you have used computers through a command-line interface (such as the UNIX shell), you’ve probably seen string pattern matching before. Frequently, the asterisk (*) is used instead of SQL’s percent sign (%), and the question mark (?) is used instead of SQL’s underscore (_), but the pattern matching capabilities themselves are similar in most situations where a computer application offers the capability to match selected parts of a word or text.

Escape Characters *

One of the problems with string pattern matching is how to match the wildcard characters themselves as literal characters. To test for the presence of a percent sign character in a column of text data, for example, you can’t simply include the percent sign in the pattern because SQL will treat it as a wildcard. With some popular SQL products, you cannot literally match the two wildcard characters. This usually doesn’t pose serious problems, because the wildcard characters don’t frequently appear in names, product numbers, and other text data of the sort that is usually stored in a database.

The ANSI/ISO SQL standard does specify a way to literally match wildcard characters, using a special escape character. When the escape character appears in the pattern, the character immediately following it is treated as a literal character rather than as a wildcard character. (The latter character is said to be escaped.) The escaped character can be either of the two wildcard characters, or the escape character itself, which has now taken on a special meaning within the pattern.

The escape character is specified as a one-character constant string in the ESCAPE clause of the search condition, as shown in Figure 6-10. Here is an example using a dollar sign ($) as the escape character:

Find products whose product IDs start with the four letters “A%BC”.

 SELECT ORDER_NUM, PRODUCT

FROM ORDERS

WHERE PRODUCT LIKE ‘A$%BC%’ ESCAPE ‘$’

The first percent sign in the pattern, which follows an escape character, is treated as a literal percent sign; the second functions as a wildcard.

The use of escape characters is very common in pattern matching applications, which is why the ANSI/ISO standard specified it. However, it was not a part of the early SQL implementations and has been slowly adopted. To insure portability, the ESCAPE clause should be avoided.

7. The Null Value Test (is null)

NULL values create a three-valued logic for SQL search conditions. For any given row, the result of a search condition may be TRUE or FALSE, or it may be NULL because one of the columns used in evaluating the search condition contains a NULL value. Sometimes it’s useful to check explicitly for NULL values in a search condition and handle them directly. SQL provides a special NULL value test (IS NULL), shown in Figure 6-11, to handle this task.

This query uses the NULL value test to find the salesperson in the sample database who has not yet been assigned to an office:

Find the salesperson not yet assigned to an office.

SELECT NAME

FROM SALESREPS

WHERE REP_OFFICE IS NULL

NAME

——

Tom Snyder

The negated form of the NULL value test (IS NOT NULL) finds rows that do not contain a NULL value:

List the salespeople who have been assigned to an office.

SELECT NAME

FROM SALESREPS

WHERE REP_OFFICE IS NOT NULL

NAME

———

Bill Adams

Mary Jones

Sue Smith

Sam Clark

Bob Smith

Dan Roberts

Larry Fitch

Paul Cruz

Nancy Angelli

Unlike the previously described search conditions, the NULL value test cannot yield a NULL result. It is always either TRUE or FALSE.

It may seem strange that you can’t just test for a NULL value using a simple comparison search condition, such as this:

 SELECT NAME

FROM SALESREPS

WHERE REP_OFFICE = NULL

The NULL keyword can’t be used here because it isn’t really a value; it’s just a signal that the value is unknown. Even if the comparison test:

REP_OFFICE = NULL

were legal, the rules for handling NULL values in comparisons would cause it to behave differently from what you might expect. When SQL encountered a row where the REP_ OFFICE column was NULL, the search condition would test:

NULL = NULL

Is the result TRUE or FALSE? Because the values on both sides of the equal sign are unknown, SQL can’t tell, so the rules of SQL logic say that the search condition itself must yield a NULL result. Because the search condition doesn’t produce a true result, the row is excluded from the query results—precisely the opposite of what you wanted to happen! As a result of the way SQL handles NULLs in comparisons, you must explicitly use the NULL value test to check for NULL values.

8. Compound Search Conditions (and, or, and not)

The simple search conditions described in the preceding sections return a value of TRUE, FALSE, or NULL when applied to a row of data. Using the rules of logic, you can combine these simple SQL search conditions to form more complex ones, as shown in Figure 6-12. Note that the search conditions combined with AND, OR, and NOT may themselves be compound search conditions.

The keyword OR is used to combine two search conditions when one or the other (or both) must be true:

Find salespeople who are under quota or with sales under $300,000.

You can also use the keyword AND to combine two search conditions that must both be true:

Find salespeople who are under quota and with sales under $300,000.

SELECT NAME, QUOTA, SALES FROM SALESREPS WHERE SALES < QUOTA AND SALES < 300000.00

NAME                           QUOTA           SALES

Bob Smith                $200,000.00         $142,594.00

Nancy Angelli            $300,000.00         $186,042.00

Finally, you can use the keyword NOT to select rows where a search condition is false:

Find all salespeople who are under quota, but whose sales are not under $150,000.

SELECT NAME, QUOTA, SALES

FROM SALESREPS

WHERE SALES < QUOTA

AND NOT SALES < 150000.00

NAME               QUOTA           SALES

———————————————

Nancy Angelli   $300,000.00      $186,042.00

Using the logical AND, OR, and NOT keywords and parentheses to group the search criteria, you can build very complex search criteria, such as the one in this query:

Find all salespeople who either: (a) work in Denver, New York, or Chicago; or (b) have no manager and were hired since June 1988; or (c) are over quota, but have sales of $600,000 or less.

SELECT NAME

FROM   SALESREPS

WHERE (REP_OFFICE IN (22, 11, 12))

OR    (MANAGER IS NULL AND HIRE_DATE >= ’01-JUN-88′)

OR    (SALES > QUOTA AND NOT SALES > 600000.00)

Exactly why you might want to see this particular list of names is a mystery, but the example does illustrate a reasonably complex query.

As with simple search conditions, NULL values influence the outcome of compound search conditions, and the results are subtle. In particular, the result of (NULL OR TRUE) is TRUE, not NULL, as you might expect. Tables 6-1, 6-2, and 6-3 specify truth tables for AND, OR, and NOT, respectively, and show the impact of NULL values.

When more than two search conditions are combined with AND, OR, and NOT, the ANSI/ISO standard specifies that NOT has the highest precedence, followed by AND and then OR. To ensure portability, it’s always a good idea to use parentheses and remove any possible ambiguity.

The SQL2 standard adds another logical search condition, the IS test, to the logic provided by AND, OR, and NOT. Figure 6-13 shows the syntax of the IS test, which checks to see whether the logical value of an expression or comparison test is TRUE, FALSE, or UNKNOWN (NULL).

For example, the IS test:

((SALES – QUOTA) > 10000.00) IS UNKNOWN

can be used to find rows where the comparison cannot be done because either SALES or QUOTA has a NULL value. Similarly, the IS test:

((SALES – QUOTA) > 10000.00) IS FALSE

will select rows where SALES are not significantly above QUOTA. As this example shows, the IS test doesn’t really add to the expressive power of SQL, since the test could just as easily have been written:

NOT ((SALES – QUOTA) > 10000.00)

For maximum portability, it’s a good idea to avoid the tests and write the expressions using only AND, OR, and NOT. It’s not always possible to avoid the IS UNKNOWN form of the test.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *