The Preprocessor in C: The #define Statement

One of the primary uses of the #define statement is to assign symbolic names to program constants. The preprocessor statement

#define YES   1

defines the name YES and makes it equivalent to the value 1. The name YES can subsequently be used anywhere in the program where the constant 1 could be used. Whenever this name appears, its defined value of 1 is automatically substituted into the program by the preprocessor. For example, you might have the following C statement that uses the defined name YES:

gameOver = YES;

This statement assigns the value of YES to gameOver. You don’t need to concern yourself with the actual value that you defined for YES, but because you do know that it is defined as 1, the preceding statement has the effect of assigning 1 to gameOver. The preprocessor statement

#define NO   0

defines the name NO and makes its subsequent use in the program equivalent to specifying the value 0. Therefore, the statement

gameOver = NO;

assigns the value of NO to gameOver, and the statement

if ( gameOver == NO )

compares the value of gameOver against the defined value of NO. Just about the only place that you cannot use a defined name is inside a character string; so the statement

char *charPtr = "YES";

sets charPtr pointing to the string “YES” and not to the string “1”.

A defined name is not a variable. Therefore, you cannot assign a value to it, unless the result of substituting the defined value is in fact a variable. Whenever a defined name is used in a program, whatever appears to the right of the defined name in the #define statement gets automatically substituted into the program by the preprocessor. It’s analogous to doing a search and replace with a text editor; in this case, the preprocessor replaces all occurrences of the defined name with its associated text.

Notice that the #define statement has a special syntax: There is no equal sign used to assign the value 1 to YES. Furthermore, a semicolon does not appear at the end of the statement. Soon, you will understand why this special syntax exists. But first, take a look at a small program that uses the YES and NO defines as previously illustrated. The function isEven in Program 13.1 simply returns YES if its argument is even and NO if its argument is odd.

Program 13.1   Introducing the #define Statement

#include <stdio.h>

#define YES 1
#define NO  0

// Function to determine if an integer is even

int isEven (int number)
{
    int answer;

    if ( number % 2 == 0 )
        answer = YES;
    else
        answer = NO;

    return answer;
}

int main (void)
{
    int isEven (int number);

    if ( isEven (17) == YES )
        printf ("yes ");
    else
        printf ("no ");

    if ( isEven (20) == YES )
        printf ("yes\n");
    else
        printf ("no\n");

    return 0;
}

Program 13.1   Output

no yes

The #define statements appear first in the program. This is not required; they can appear anywhere in the program. What is required is that a name be defined before it is referenced by the program. Defined names do not behave like variables: There is no such thing as a local define. After a name has been defined in a program, either inside or outside a function, it can subsequently be used anywhere in the rest of the program. Most programmers group their defines at the beginning of the program (or inside an include file) where they can be quickly referenced and shared by more than one source file.

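The defined name NULL is frequently used by programmers to represent the null pointer. (The standard headers, such as <stddef.h> and <stdio.h>, already define NULL, so in practice you get it by including one of them.) By including a definition such as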

#define NULL 0

in a program, you can then write more readable statements, such as

while ( listPtr != NULL )

to set up a while loop that will execute as long as the value of listPtr is not equal to the null pointer.

As another example of the use of a defined name, suppose you want to write three functions to find the area of a circle, the circumference of a circle, and the volume of a sphere of a given radius. Because all these functions need to use the constant π, which is not a particularly easy constant to remember, it makes sense to define the value of this constant once at the start of the program and then use this value where needed in each function.

Program 13.2 shows how a definition for this constant can be set up and used in a program.

Program 13.2   More on Working with Defines

/* Function to calculate the area and circumference of a circle,
   and the volume of a sphere of a given radius */

#include <stdio.h>

#define PI      3.141592654

double area (double r)
{
    return PI * r * r;
}

double circumference (double r)
{
    return 2.0 * PI * r;
}

double volume (double r)
{
    return 4.0 / 3.0 * PI * r * r * r;
}

int main (void)
{
    double area (double r), circumference (double r),
           volume (double r);

    printf ("radius = 1: %.4f %.4f %.4f\n", area(1.0),
            circumference(1.0), volume(1.0));

    printf ("radius = 4.98: %.4f %.4f %.4f\n", area(4.98),
            circumference(4.98), volume(4.98));

    return 0;
}

Program 13.2   Output

radius = 1: 3.1416 6.2832 4.1888

radius = 4.98: 77.9128 31.2903 517.3403

The symbolic name PI is defined as the value 3.141592654 at the beginning of the program. Subsequent use of the name PI inside the area, circumference, and volume functions has the effect of causing its defined value to be automatically substituted at the appropriate point.

Assignment of a constant to a symbolic name frees you from having to remember the particular constant value every time you want to use it in a program. Furthermore, if you ever need to change the value of the constant (if, perhaps, you find out that you are using the wrong value), you only have to change the value in one place in the program: in the #define statement. Without this approach, you would have to search throughout the program and explicitly change the value of the constant wherever it was used.

You might have realized that all the defines you have seen so far (YES, NO, NULL, and PI) have been written in capital letters. The reason this is done is to visually distinguish a defined value from a variable. Some programmers adopt the convention that all defined names be capitalized, so that it becomes easy to determine when a name represents a variable and when it represents a defined name. Another common convention is to prefix the define with the letter k. In that case, the following characters of the name are not capitalized. kMaximumValues and kSignificantDigits are two examples of defined names that adhere to this convention.

1. Program Extendability

Using a defined name for a constant value helps to make programs more readily extendable. For example, when you define an array, you must specify the number of elements in the array, either explicitly or implicitly (by specifying a list of initializers). Subsequent program statements will likely use the knowledge of the number of elements contained inside the array. For example, if the array dataValues is defined in a program as follows:

float dataValues[1000];

there is a good chance that you will see statements in the program that use the fact that dataValues contains 1,000 elements. For instance, in a for loop

for ( i = 0; i < 1000; ++i )

you would use the value 1000 as an upper bound for sequencing through the elements of the array. A statement such as

if ( index > 999 )

might also be used in the program to test if an index value exceeds the maximum size of the array.

Now suppose that you had to increase the size of the dataValues array from 1,000 to 2,000 elements. This would necessitate changing all statements that used the fact that dataValues contained 1,000 elements.

A better way of dealing with array bounds, which makes programs easier to extend, is to define a name for the upper array bound. So, if you define a name such as MAXIMUM_DATAVALUES with an appropriate #define statement:

#define MAXIMUM_DATAVALUES       1000

you can subsequently define the dataValues array to contain MAXIMUM_DATAVALUES elements with the following program line:

float dataValues[MAXIMUM_DATAVALUES];

Statements that use the upper array bound can also make use of this defined name. To sequence through the elements in dataValues, for example, the for statement

for ( i = 0; i < MAXIMUM_DATAVALUES; ++i )

could be used. To test if an index value is greater than the upper bound of the array, you could write

if ( index > MAXIMUM_DATAVALUES - 1 )

and so on. The nicest thing about the preceding approach is that you can now easily change the size of the dataValues array to 2,000 elements by simply changing the definition:

#define MAXIMUM_DATAVALUES       2000

And if the program is written to use MAXIMUM_DATAVALUES in all cases where the size of the array was used, the preceding definition could be the only statement in the program that would have to be changed.

2. Program Portability

Another nice use of the define is that it helps to make programs more portable from one computer system to another. At times, it might be necessary to use constant values that are related to the particular computer on which the program is running. This might have to do with the use of a particular computer memory address, a filename, or the number of bits contained in a computer word, for example. You will recall that your rotate function from Program 12.4 used the knowledge that an int contained 32 bits on the machine on which the program was executed.

If you want to execute this program on a different machine, on which an int contained 64 bits, the rotate function would not work correctly. In situations in which the program must be written to make use of machine-dependent values, it makes sense to isolate such dependencies from the program as much as possible. The #define statement can help significantly in this respect. The new version of the rotate function would be easier to port to another machine, even though it is a rather simple case in point. Here’s the new function:

#include <stdio.h>

#define kIntSize 32     // *** machine dependent !!! ***

// Function to rotate an unsigned int left or right

unsigned int rotate (unsigned int value, int n)
{
    unsigned int result, bits;

    /* scale down the shift count to a defined range */

    if ( n > 0 )
        n = n % kIntSize;
    else
        n = -(-n % kIntSize);

    if ( n == 0 )
        result = value;
    else if ( n > 0 )   /* left rotate */
    {
        bits = value >> (kIntSize - n);
        result = value << n | bits;
    }
    else                /* right rotate */
    {
        n = -n;
        bits = value << (kIntSize - n);
        result = value >> n | bits;
    }

    return result;
}

3. More Advanced Types of Definitions

A definition for a name can include more than a simple constant value. It can include an expression, and, as you will see shortly, just about anything else!

The following defines the name TWO_PI as the product of 2.0 and 3.141592654:

#define TWO_PI  2.0 * 3.141592654

You can subsequently use this defined name anywhere in a program where the expression 2.0 * 3.141592654 would be valid. So you could have replaced the return statement of the circumference function from the previous program with the following statement, for example:

return TWO_PI * r;

Whenever a defined name is encountered in a C program, everything that appears to the right of the defined name in the #define statement is literally substituted for the name at that point in the program. So, when the C preprocessor encounters the name TWO_PI in the return statement shown previously, it substitutes for this name whatever appeared in the #define statement for this name. Therefore, 2.0 * 3.141592654 is literally substituted by the preprocessor whenever the defined name TWO_PI occurs in the program.

The fact that the preprocessor performs a literal text substitution whenever the defined name occurs explains why you don’t usually want to end your #define statement with a semicolon. If you did, then the semicolon would also be substituted into the program wherever the defined name appeared. If you had defined PI as

#define PI      3.141592654;

and then written

return 2.0 * PI * r;

the preprocessor would replace the occurrence of the defined name PI with 3.141592654;. The compiler would therefore see this statement as

return 2.0 * 3.141592654; * r;

after the preprocessor had made its substitution, which would result in a syntax error.

A preprocessor definition does not have to be a valid C expression in its own right, just so long as wherever it is used the resulting expression is valid. For instance, the definition

#define LEFT_SHIFT_8 << 8

is legitimate, even though what appears after LEFT_SHIFT_8 is not a syntactically valid expression. You can use your definition of LEFT_SHIFT_8 in a statement such as

x = y LEFT_SHIFT_8;

to shift the contents of y to the left eight bits and assign the result to x. Of a much more practical nature, you can set up the definitions

#define AND    &&

#define OR     ||

and then write expressions such as

if ( x > 0 AND x < 10 )

and

if ( y == 0 OR y == value )

You can even include a define for the equality test:

#define EQUALS ==

and then write the statement

if ( y EQUALS 0 OR y EQUALS value )

thus removing the very real possibility of mistakenly using a single equal sign for the equality test, as well as improving the statement’s readability.

Although these examples illustrate the power of the #define, you should note that it is commonly considered poor programming practice to redefine the syntax of the underlying language in such a manner. Moreover, it can make it harder for someone else to understand your code.

To make things even more interesting, a defined value can itself reference another defined value. So the two defines

#define PI     3.141592654

#define TWO_PI 2.0 * PI

are perfectly valid. The name TWO_PI is defined in terms of the previously defined name PI, thus obviating the need to spell out the value 3.141592654 again.

Reversing the order of the defines, as in

#define TWO_PI  2.0 * PI

#define PI      3.141592654

is also valid. The rule is that you can reference other defined values in your definitions provided everything is defined at the time the defined name is used in the program.

Good use of defines often reduces the need for comments within the program. Consider the following statement:

if ( year % 4 == 0 && year % 100 != 0 || year % 400 == 0 )

You know from previous programs in this book that the preceding expression tests whether the variable year is a leap year. Now consider the following define and the subsequent if statement:

#define IS_LEAP_YEAR year % 4 == 0 && year % 100 != 0 \
                     || year % 400 == 0

if ( IS_LEAP_YEAR )

Normally, the preprocessor assumes that a definition is contained on a single line of the program. If a second line is needed, the final character on the line must be a backslash character. This character signals a continuation to the preprocessor and is otherwise ignored. The same holds true for more than one continuation line; each line to be continued must be ended with a backslash character.

The preceding if statement is far easier to understand than the one shown directly before it. There is no need for a comment as the statement is self-explanatory. The purpose that the define IS_LEAP_YEAR serves is analogous to that served by a function. You could have used a call to a function named is_leap_year to achieve the same degree of readability. The choice of which to use in this case is completely subjective. Of course, the is_leap_year function could be made more general than the preceding define because it could be written to take an argument. This would enable you to test if the value of any variable were a leap year and not just the variable year to which the IS_LEAP_YEAR define restricts you. Actually, you can write a definition to take one or more arguments, which leads to our next point of discussion.

4. Arguments and Macros

IS_LEAP_YEAR can be defined to take an argument called y as follows:

#define IS_LEAP_YEAR(y)   y % 4 == 0 && y % 100 != 0 \
                          || y % 400 == 0

Unlike a function, you do not define the type of the argument y here because you are merely performing a literal text substitution and not invoking a function.

Note that no spaces are permitted in the #define statement between the defined name and the left parenthesis of the argument list.

With the preceding definition, you can write a statement such as

if ( IS_LEAP_YEAR (year) )

to test whether the value of year is a leap year, or

if ( IS_LEAP_YEAR (next_year) )

to test whether the value of next_year is a leap year. In the preceding statement, the definition for IS_LEAP_YEAR would be directly substituted inside the if statement, with the argument next_year replacing y wherever it appeared in the definition. So the if statement would actually be seen by the compiler as

if ( next_year % 4 == 0 && next_year % 100 != 0
     || next_year % 400 == 0 )

In C, definitions are frequently called macros. This terminology is more often applied to definitions that take one or more arguments. An advantage of implementing something in C as a macro, as opposed to as a function, is that in the former case, the type of the argument is not important. For example, consider a macro called SQUARE that simply squares its argument. The definition

#define SQUARE(x)  x * x

enables you to subsequently write statements, such as

y = SQUARE (v);

to assign the value of v² to y. The point to be made here is that v can be of type int, or of type long, or of type float, for example, and the same macro can be used. If SQUARE were implemented as a function that took an int argument, for example, you couldn’t use it to calculate the square of a double value. One consideration about macro definitions, which might be relevant to your application: Because macros are directly substituted into the program by the preprocessor at every point of use, they generally take up more memory space than an equivalently defined function. On the other hand, because a function takes time to call and to return, this overhead is avoided when a macro definition is used instead.

Although the macro definition for SQUARE is straightforward, there is an interesting pitfall to avoid when defining macros. As has been described, the statement

y = SQUARE (v);

assigns the value of v² to y. What do you think would happen in the case of the statement

y = SQUARE (v + 1);

This statement does not assign the value of (v + 1)² to y as you would expect. Because the preprocessor performs a literal text substitution of the argument into the macro definition, the preceding expression would actually be evaluated as

y = v + 1 * v + 1;

which would obviously not produce the expected results. To handle this situation properly, parentheses are needed in the definition of the SQUARE macro:

#define SQUARE(x) ( (x) * (x) )

Even though the preceding definition might look strange, remember that it is the entire expression as given to the SQUARE macro that is literally substituted wherever x appears in the definition. With your new macro definition for SQUARE, the statement

y = SQUARE (v + 1);

is then correctly evaluated as

y = ( (v + 1) * (v + 1) );

The conditional expression operator can be particularly handy when defining macros. The following defines a macro called MAX that gives the maximum of two values:

#define MAX(a,b) ( ((a) > (b)) ? (a) : (b) )

This macro enables you to subsequently write statements such as

limit = MAX (x + y, minValue);

which would assign to limit the maximum of x + y and minValue. Parentheses were placed around the entire MAX definition to ensure that an expression such as

MAX (x, y) * 100

gets evaluated properly; and parentheses were individually placed around each argument to ensure that expressions such as

MAX (x & y, z)

get correctly evaluated. The bitwise AND operator has lower precedence than the > operator used in the macro. Without the parentheses in the macro definition, the > operator would be evaluated before the bitwise AND, producing the incorrect result.

The following macro tests if a character is a lowercase letter:

#define IS_LOWER_CASE(x)  ( ((x) >= 'a') && ((x) <= 'z') )

and thereby permits expressions such as

if ( IS_LOWER_CASE (c) )

to be written. You can even use this macro in a subsequent macro definition to convert an ASCII character from lowercase to uppercase, leaving any nonlowercase character unchanged:

#define TO_UPPER(x) ( IS_LOWER_CASE (x) ? (x) - 'a' + 'A' : (x) )

The program loop

while ( *string != '\0' )
{
    *string = TO_UPPER (*string);
    ++string;
}

would sequence through the characters pointed to by string, converting any lowercase characters in the string to uppercase.

5. Variable Number of Arguments to Macros

A macro can be defined to take an indeterminate or variable number of arguments. This is specified to the preprocessor by putting three dots at the end of the argument list. The remaining arguments in the list are collectively referenced in the macro definition by the special identifier __VA_ARGS__. As an example, the following defines a macro called debugPrintf to take a variable number of arguments:

#define debugPrintf(...) printf ("DEBUG: " __VA_ARGS__)

Legitimate macro uses would include debugPrintf ("Hello world!\n"); as well as

debugPrintf ("i = %i, j = %i\n", i, j);

In the first case, the output would be

DEBUG: Hello world!

And in the second case, if i had the value 100 and j the value 200, the output would be

DEBUG: i = 100, j = 200

The printf call in the first case gets expanded into

printf ("DEBUG: " "Hello world!\n");

by the preprocessor, which also concatenates the adjacent character string constants together. So the final printf call looks like this:

printf ("DEBUG: Hello world!\n");

6. The # Operator

If you place a # in front of a parameter in a macro definition, the preprocessor creates a constant string out of the macro argument when the macro is invoked. For example, the definition

#define str(x) # x

causes the subsequent invocation

str (testing)

to be expanded into

"testing"

by the preprocessor. The printf call

printf (str (Programming in C is fun.\n));

is therefore equivalent to

printf ("Programming in C is fun.\n");

The preprocessor literally inserts double quotation marks around the actual macro argument. Any double quotation marks or backslashes in the argument are preserved by the preprocessor. So

str ("hello")

produces

"\"hello\""

A more practical example of the use of the # operator might be in the following macro definition:

#define printint(var)     printf (# var " = %i\n", var)

This macro is used to display the value of an integer variable. If count is an integer variable with a value of 100, the statement

printint (count);

is expanded into

printf ("count" " = %i\n", count);

which, after string concatenation is performed on the two adjacent strings, becomes

printf ("count = %i\n", count);

So the # operator gives you a means to create a character string out of a macro argument. Incidentally, a space between the # and the parameter name is optional.

7. The ## Operator

This operator is used in macro definitions to join two tokens together. It is preceded (or followed) by the name of a parameter to the macro. The preprocessor takes the actual argument to the macro that is supplied when the macro is invoked and creates a single token out of that argument and whatever token follows (or precedes) the ##.

Suppose, for example, you have a list of variables x1 through x100. You can write a macro called printx that simply takes as its argument an integer value 1 through 100 and that displays the corresponding x variable as shown:

#define printx(n)  printf ("%i\n", x ## n)

The portion of the define that reads

x ## n

says to take the tokens that occur before and after the ## (the letter x and the argument n, respectively) and make a single token out of them. So the call

printx (20);

is expanded into

printf ("%i\n", x20);

The printx macro can even use the previously defined printint macro to get the variable name as well as its value displayed:

#define printx(n)  printint(x ## n)

The invocation

printx (10);

first expands into printint (x10); and then into

printf ("x10" " = %i\n", x10);

and finally into

printf ("x10 = %i\n", x10);

Source: Kochan Stephen G. (2004), Programming in C: A Complete Introduction to the C Programming Language, Sams; Subsequent edition.
