Character Data Type and Operations in C++

A character data type represents a single character.

In addition to processing numeric values, you can process characters in C++. The character data type, char, is used to represent a single character. A character literal is enclosed in single quotation marks. Consider the following code:

char letter = 'A';
char numChar = '4';

The first statement assigns character A to the char variable letter. The second statement assigns digit character 4 to the char variable numChar.

1. ASCII Code

Computers use binary numbers internally. A character is stored in a computer as a sequence of 0s and 1s. Mapping a character to its binary representation is called encoding. There are different ways to encode a character. How characters are encoded is defined by an encoding scheme.

Most computers use ASCII (American Standard Code for Information Interchange), a 7-bit encoding scheme, normally stored one character per 8-bit byte, for representing all uppercase and lowercase letters, digits, punctuation marks, and control characters. Table 4.4 shows the ASCII code for some commonly used characters. Appendix B, “The ASCII Character Set,” gives a complete list of ASCII characters and their decimal and hexadecimal codes. In C++, sizeof(char) is always 1, and on most systems that byte is 8 bits wide.
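As a small illustration of the mapping between characters and codes, the sketch below prints the numeric code stored for a character. The function names asciiCode and printCode are hypothetical helpers introduced here, and the values in the comments assume an ASCII-based system.

```cpp
#include <iostream>

// Returns the numeric code stored for ch.
// On an ASCII-based system, asciiCode('A') is 65.
int asciiCode(char ch)
{
  return static_cast<int>(ch);
}

// Prints a character next to its decimal and hexadecimal codes.
void printCode(char ch)
{
  std::cout << ch << " -> " << std::dec << asciiCode(ch)
            << " (hex " << std::hex << asciiCode(ch) << std::dec << ")\n";
}
```

Calling printCode('A') from main would be expected to show the decimal code 65 and the hexadecimal code 41 on such a system.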

2. Reading a Character from the Keyboard

To read a character from the keyboard, use

cout << "Enter a character: ";
char ch;
cin >> ch; // Read a character
cout << "The character read is " << ch << endl;
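One detail worth knowing is that the >> operator skips leading whitespace before it reads a character, whereas the stream's get function returns the very next character, blank or not. The sketch below contrasts the two; readVisible and readRaw are hypothetical names, and the functions take any input stream so that they can be tried on an in-memory std::istringstream as well as on cin.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a keyboard

// Reads the next non-whitespace character, as cin >> ch does.
char readVisible(std::istream& in)
{
  char ch = '\0';
  in >> ch;
  return ch;
}

// Reads the very next character, whitespace included.
char readRaw(std::istream& in)
{
  char ch = '\0';
  in.get(ch);
  return ch;
}
```

For example, given a stream holding two spaces followed by x, readVisible returns x while readRaw returns a space. Passing cin as the argument reads from the keyboard instead.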

3. Escape Sequences for Special Characters

Suppose you want to print a message with quotation marks in the output. Can you write a statement like this?

cout << "He said "Programming is fun"" << endl;

No, this statement has a compile error. The compiler thinks the second quotation character is the end of the string and does not know what to do with the rest of the characters.

To overcome this problem, C++ uses a special notation to represent special characters, as shown in Table 4.5. This special notation, called an escape sequence, consists of a backslash (\) followed by a character or a combination of digits. For example, \t is an escape sequence for the Tab character. The symbols in an escape sequence are interpreted as a whole rather than individually. An escape sequence is considered as a single character.

So, now you can print the quoted message using the following statement:

cout << "He said \"Programming is fun\"" << endl;

The output is

He said "Programming is fun"

Note that the symbols \ and " together represent one character.

The backslash \ is called an escape character. It is a special character. To display the backslash itself, you have to use the escape sequence \\. For example, the following code

cout << "\\t is a tab character" << endl;

displays

\t is a tab character
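To make the point that each escape sequence counts as one character, the following sketch builds a few strings with escapes so that their contents can be inspected. The function names are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Each \" contributes one double-quote character to the string.
std::string quotedMessage()
{
  return "He said \"Programming is fun\"";
}

// \\ contributes one backslash, so the text starts with \ followed by t.
std::string backslashMessage()
{
  return "\\t is a tab character";
}

// \t is a single Tab character and \n is a single newline character.
std::string tabbedColumns()
{
  return "Name\tCode\nA\t65\n";
}
```

Printing quotedMessage() with cout should reproduce the quoted sentence shown earlier, and the character at index 8 of that string is the opening quotation mark.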


4. Casting between char and Numeric Types

A char can be cast into any numeric type, and vice versa. When an integer is cast into a char, only its lower 8 bits of data are used; the other part is ignored. For example,

char c = 0XFF41; // The lower 8 bits hex code 41 is assigned to c

cout << c;       // variable c is character A

When a floating-point value is cast into a char, the floating-point value is first cast into an int, which is then cast into a char.

char c = 65.25;     // 65 is assigned to variable c

cout << c;          // variable c is character A

When a char is cast into a numeric type, the character’s ASCII code is cast into the specified numeric type. For example:

int i = 'A'; // The ASCII code of character A is assigned to i
cout << i;   // variable i is 65
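The implicit conversions above can also be written explicitly, which makes the truncation visible. The sketch below is one possible way to do so; lowByteAsChar and fromDouble are hypothetical helper names, and the results in the comments assume an ASCII-based system.

```cpp
// Keeps only the lower 8 bits of value and returns them as a char.
// lowByteAsChar(0xFF41) keeps 0x41, which is 'A' in ASCII.
char lowByteAsChar(int value)
{
  return static_cast<char>(value & 0xFF);
}

// Converts to int first (the fractional part is discarded),
// then narrows the int to a char. fromDouble(65.25) is 'A' in ASCII.
char fromDouble(double value)
{
  return static_cast<char>(static_cast<int>(value));
}
```

Writing the mask and the casts out costs a few extra characters, but it documents the intent and avoids compiler warnings about narrowing conversions.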

The char type is treated as if it were a one-byte integer. All numeric operators can be applied to char operands. A char operand is automatically cast into a number if the other operand is a number or a character. For example, the following statements

// The ASCII code for '2' is 50 and for '3' is 51
int i = '2' + '3';
cout << "i is " << i << endl; // i is now 101
int j = 2 + 'a'; // The ASCII code for 'a' is 97
cout << "j is " << j << endl;
cout << j << " is the ASCII code for character " <<
  static_cast<char>(j) << endl;

display

i is 101

j is 99

99 is the ASCII code for character c

Note that the static_cast<char>(value) operator explicitly casts a numeric value into a character.
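A common use of character arithmetic is converting between a digit character and the number it denotes. The following sketch shows that idea along with shifting a letter by an offset. The helper names are hypothetical, and the code assumes its arguments are in range (a digit character, a value from 0 to 9, or an offset that stays within the alphabet).

```cpp
// Converts a digit character such as '7' to the integer 7.
// Assumes digit is between '0' and '9'.
int digitValue(char digit)
{
  return digit - '0';
}

// Converts an integer from 0 to 9 to the matching digit character.
char digitChar(int value)
{
  return static_cast<char>('0' + value);
}

// Moves a letter forward by offset positions, e.g. 'a' + 2 gives 'c'.
// Assumes the result stays within the same run of letters.
char shiftLetter(char letter, int offset)
{
  return static_cast<char>(letter + offset);
}
```

Note that digitValue('2') + digitValue('3') is 5, whereas '2' + '3' adds the character codes and gives 101 on an ASCII-based system.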

As shown in Table 4.4, ASCII codes for lowercase letters are consecutive integers starting from the ASCII code for 'a', then for 'b', 'c', ..., and 'z'. The same is true for the uppercase letters and numeric characters. Furthermore, the ASCII code for 'a' is greater than the code for 'A'. You can use these properties to convert an uppercase letter to lowercase or vice versa. Listing 4.2 gives a program that prompts the user to enter a lowercase letter and finds its corresponding uppercase letter.

Listing 4.2 ToUppercase.cpp

 1 #include <iostream>
 2 using namespace std;
 3
 4 int main()
 5 {
 6   cout << "Enter a lowercase letter: ";
 7   char lowercaseLetter;
 8   cin >> lowercaseLetter;
 9
10   char uppercaseLetter =
11     static_cast<char>('A' + (lowercaseLetter - 'a'));
12
13   cout << "The corresponding uppercase letter is "
14     << uppercaseLetter << endl;
15
16   return 0;
17 }

Note that for a lowercase letter ch1 and its corresponding uppercase letter ch2, ch1 - 'a' is the same as ch2 - 'A'. Hence, ch2 = 'A' + ch1 - 'a'. Therefore, the corresponding uppercase letter for lowercaseLetter is static_cast<char>('A' + (lowercaseLetter - 'a')) (line 11). Note that lines 10-11 can be replaced by

char uppercaseLetter = 'A' + (lowercaseLetter - 'a');

Since uppercaseLetter is declared as a char type value, C++ automatically converts the int value 'A' + (lowercaseLetter - 'a') to a char value.
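The formula in Listing 4.2 produces a meaningful result only when the input really is a lowercase letter. The sketch below wraps the same arithmetic in a hypothetical function, toUppercase, that leaves other characters unchanged. It also shows an alternative based on std::toupper from the standard header <cctype>, which is a different technique from the listing's arithmetic. The manual version assumes an encoding such as ASCII in which the letters are contiguous.

```cpp
#include <cctype>

// Applies the arithmetic from Listing 4.2, but only to lowercase letters.
// Any other character is returned unchanged.
char toUppercase(char ch)
{
  if (ch >= 'a' && ch <= 'z')
    return static_cast<char>('A' + (ch - 'a'));
  return ch;
}

// Library alternative: std::toupper takes and returns an int, and its
// argument should be representable as unsigned char, hence the casts.
char toUppercaseStd(char ch)
{
  return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
}
```

In the default "C" locale both functions agree on ASCII input, for example mapping b to B and leaving ? as it is.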

5. Comparing and Testing Characters

Two characters can be compared using the relational operators just like comparing two numbers. This is done by comparing the ASCII codes of the two characters. For example,

'a' < 'b' is true because the ASCII code for 'a' (97) is less than the ASCII code for 'b' (98).

'a' < 'A' is false because the ASCII code for 'a' (97) is greater than the ASCII code for 'A' (65).

'1' < '8' is true because the ASCII code for '1' (49) is less than the ASCII code for '8' (56).

Often in a program, you need to test whether a character is a digit, a letter, an uppercase letter, or a lowercase letter. For example, the following code tests whether a character ch is an uppercase letter, a lowercase letter, or a digit.

if (ch >= 'A' && ch <= 'Z')
  cout << ch << " is an uppercase letter" << endl;
else if (ch >= 'a' && ch <= 'z')
  cout << ch << " is a lowercase letter" << endl;
else if (ch >= '0' && ch <= '9')
  cout << ch << " is a numeric character" << endl;
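The same tests can be expressed with the classification functions std::isupper, std::islower, and std::isdigit from the standard header <cctype>, which avoids hard-coding the character ranges. The sketch below is one way to package that; the function name classify and its return strings are assumptions made for this example.

```cpp
#include <cctype>
#include <string>

// Describes the category of ch using the <cctype> functions.
// The argument is converted to unsigned char first, because passing
// a negative char value to these functions is undefined behavior.
std::string classify(char ch)
{
  const unsigned char uch = static_cast<unsigned char>(ch);
  if (std::isupper(uch))
    return "uppercase letter";
  if (std::islower(uch))
    return "lowercase letter";
  if (std::isdigit(uch))
    return "numeric character";
  return "other";
}
```

In the default "C" locale, classify('Q') yields "uppercase letter" and classify('#') yields "other", a case the if-else chain above silently ignores.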

Source: Liang Y. Daniel (2013), Introduction to programming with C++, Pearson; 3rd edition.
