Character Strings in C: Escape Characters

As alluded to previously, the backslash character has a special significance that extends beyond its use in forming the newline and null characters. Just as the backslash and the letter n, when used in combination, cause subsequent printing to begin on a new line, so can other characters be combined with the backslash character to perform special functions. These various backslash characters, often referred to as escape characters, are summarized in Table 10.2.

The first seven characters listed in Table 10.2 perform the indicated function on most output devices when they are displayed. The audible alert character, \a, sounds a “bell” in most terminal windows. So, the printf call

printf ("\aSYSTEM SHUT DOWN IN 5 MINUTES!!\n");

sounds an alert and displays the indicated message.

Including the backspace character '\b' inside a character string causes the terminal to backspace one character at the point at which the character appears in the string, provided that it is supported by the terminal window. Similarly, the function call

printf ("%i\t%i\t%i\n", a, b, c);

displays the value of a, spaces over to the next tab setting (typically set to every eight columns by default), displays the value of b, spaces over to the next tab setting, and then displays the value of c. The horizontal tab character is particularly useful for lining up data in columns.

To include the backslash character itself inside a character string, two backslash characters are necessary, so the printf call

printf ("\\t is the horizontal tab character.\n");

displays the following:

\t is the horizontal tab character.

Note that because the \\ is encountered first in the string, a tab is not displayed in this case.

To include a double quotation character inside a character string, it must be preceded by a backslash. So, the printf call

printf ("\"Hello,\" he said.\n");

results in the display of the message

"Hello," he said.

To assign a single quotation character to a character variable, the backslash character must be placed before the quotation mark. If c is declared to be a variable of type char, the statement

c = '\'';

assigns a single quotation character to c.

The backslash character, followed immediately by a ?, is used to represent a ? character. This is sometimes necessary when dealing with trigraphs in non-ASCII character sets. For more details, consult Appendix A, "C Language Summary."

The final four entries in Table 10.2 enable any character to be included in a character string. In the escape character '\nnn', nnn is a one- to three-digit octal number. In the escape character '\xnn', nn is a hexadecimal number. These numbers represent the internal code of the character. This enables characters that might not be directly available from the keyboard to be coded into a character string. For example, to include an ASCII escape character, which has the value octal 33 (decimal 27), you could include the sequence \033 or \x1b inside your string.

The null character '\0' is a special case of the escape character sequence described in the preceding paragraph. It represents the character that has a value of 0. In fact, because the value of the null character is 0, this knowledge is frequently used by programmers in tests and loops dealing with variable-length character strings. For example, the loop to count the length of a character string in the function stringLength from Program 10.2 can also be equivalently coded as follows:

while ( string[count] )
    ++count;

The value of string[count] is nonzero until the null character is reached, at which point the while loop is exited.

It should once again be pointed out that each of these escape characters counts as only a single character inside a string. So, the character string "\033\"Hello\"\n" actually consists of nine characters (not counting the terminating null): the character '\033', the double quotation character '\"', the five characters in the word Hello, the double quotation character once again, and the newline character. Try passing the preceding character string to the stringLength function to verify that nine is indeed the number of characters in the string (again, excluding the terminating null).

A universal character name is formed by the characters \u followed by four hexadecimal digits or the characters \U followed by eight hexadecimal digits. It is used for specifying characters from extended character sets; that is, character sets that require more than the standard eight bits for internal representation. The universal character name escape sequence can be used to form identifier names from extended character sets, as well as to specify 16-bit and 32-bit characters inside wide character string and character string constants. For more information, refer to Appendix A.

Source: Kochan Stephen G. (2004), Programming in C: A Complete Introduction to the C Programming Language, Sams; Subsequent edition.
