Character Strings in C: Variable-Length Character Strings

You can adopt a similar approach to that used by the concat function for defining other functions to deal with character arrays. That is, you can develop a set of routines, each of which has as its arguments one or more character arrays plus the number of characters contained in each such array. Unfortunately, after working with these functions for a while, you will find that it gets a bit tedious trying to keep track of the number of characters contained in each character array that you are using in your program, especially if you are using your arrays to store character strings of varying sizes. What you need is a method for dealing with character arrays without having to worry about precisely how many characters you have stored in them.

There is such a method, and it is based upon the idea of placing a special character at the end of every character string. In this manner, a function can determine for itself that it has reached the end of a character string as soon as it encounters this special character. By developing all of your functions to deal with character strings in this fashion, you can eliminate the need to specify the number of characters that are contained inside a character string.

In the C language, the special character that is used to signal the end of a string is known as the null character and is written as ‘\0’. So, the statement

const char word[] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };

defines a character array called word that contains seven characters, the last of which is the null character. (Recall that the backslash character [\] is a special character in the C language and does not count as a separate character; therefore, ‘\0’ represents a single character in C.) The array word is depicted in Figure 10.2.

To begin with an illustration of how these variable-length character strings are used, write a function that counts the number of characters in a character string, as shown in Program 10.2. Call the function stringLength and have it take as its argument a character array that is terminated by the null character. The function determines the number of characters in the array and returns this value to the calling routine. Define the number of characters in the array as the number of characters up to, but not including, the terminating null character. So, the function call

stringLength (characterString)

should return the value 3 if characterString is defined as follows:

char characterString[] = { 'c', 'a', 't', '\0' };

Program 10.2   Counting the Characters in a String

// Function to count the number of characters in a string

#include <stdio.h>

int stringLength (const char string[])
{
    int count = 0;

    // advance until the terminating null character is found
    while ( string[count] != '\0' )
        ++count;

    return count;
}

int main (void)
{
    int  stringLength (const char string[]);
    const char word1[] = { 'a', 's', 't', 'e', 'r', '\0' };
    const char word2[] = { 'a', 't', '\0' };
    const char word3[] = { 'a', 'w', 'e', '\0' };

    printf ("%i  %i  %i\n", stringLength (word1),
            stringLength (word2), stringLength (word3));

    return 0;
}

Program 10.2   Output

5  2  3

The stringLength function declares its argument as a const array of characters because it is not making any changes to the array, merely counting its size.

Inside the stringLength function, the variable count is defined and its value set to 0. The program then enters a while loop to sequence through the string array until the null character is reached. When the function finally hits upon this character, signaling the end of the character string, the while loop is exited and the value of count is returned. This value represents the number of characters in the string, excluding the null character. You might want to trace through the operation of this loop on a small character array to verify that the value of count when the loop is exited is in fact equal to the number of characters in the array, excluding the null character.

In the main routine, three character arrays, word1, word2, and word3, are defined. The printf function call displays the results of calling the stringLength function for each of these three character arrays.

1. Initializing and Displaying Character Strings

Now, it is time to go back to the concat function developed in Program 10.1 and rewrite it to work with variable-length character strings. Obviously, the function must be changed somewhat because you no longer want to pass as arguments the number of characters in the two arrays. The function now takes only three arguments: the two character arrays to be concatenated and the character array in which to place the result.

Before delving into this program, you should first learn about two nice features that C provides for dealing with character strings.

The first feature involves the initialization of character arrays. C permits a character array to be initialized by simply specifying a constant character string rather than a list of individual characters. So, for example, the statement

char word[] = { "Hello!" };

can be used to set up an array of characters called word with the initial characters ‘H’, ‘e’, ‘l’, ‘l’, ‘o’, ‘!’, and ‘\0’, respectively. You can also omit the braces when initializing character arrays in this manner. So, the statement

char word[] = "Hello!";

is perfectly valid. Either statement is equivalent to the statement

char word[] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };

If you’re explicitly specifying the size of the array, make certain you leave enough space for the terminating null character. So, in

char word[7] = { "Hello!" };

the compiler has enough room in the array to place the terminating null character. However, in

char word[6] = { "Hello!" };

the compiler can’t fit a terminating null character at the end of the array, and so it doesn’t put one there (and it doesn’t complain about it either).

In general, wherever they appear in your program, character-string constants in the C language are automatically terminated by the null character. This fact helps functions such as printf determine when the end of a character string has been reached. So, in the call

printf ("Programming in C is fun.\n");

the null character is automatically placed after the newline character in the character string, thereby enabling the printf function to determine when it has reached the end of the format string.

The other feature to be mentioned here involves the display of character strings. The special format characters %s inside a printf format string can be used to display an array of characters that is terminated by the null character. So, if word is a null-terminated array of characters, the printf call

printf ("%s\n", word);

can be used to display the entire contents of the word array at the terminal. The printf function assumes when it encounters the %s format characters that the corresponding argument is a character string that is terminated by a null character.

The two features just described were incorporated into the main routine of Program 10.3, which illustrates your revised concat function. Because you are no longer passing the number of characters in each string as arguments to the function, the function must determine when the end of each string is reached by testing for the null character. Also, when str1 is copied into the result array, you want to be certain not to also copy the null character because this ends the string in the result array right there. You do need, however, to place a null character into the result array after str2 has been copied so as to signal the end of the newly created string.

Program 10.3   Concatenating Character Strings

#include <stdio.h>

int main (void)
{
    void concat (char result[], const char str1[], const char str2[]);
    const char s1[] = { "Test " };
    const char s2[] = { "works." };
    char s3[20];

    concat (s3, s1, s2);

    printf ("%s\n", s3);

    return 0;
}

// Function to concatenate two character strings

void concat (char result[], const char str1[], const char str2[])
{
    int i, j;

    // copy str1 to result

    for ( i = 0; str1[i] != '\0'; ++i )
        result[i] = str1[i];

    // copy str2 to result

    for ( j = 0; str2[j] != '\0'; ++j )
        result[i + j] = str2[j];

    // Terminate the concatenated string with a null character

    result[i + j] = '\0';
}

Program 10.3   Output

Test works.

In the first for loop of the concat function, the characters contained inside str1 are copied into the result array until the null character is reached. Because the for loop terminates as soon as the null character is matched, it does not get copied into the result array.

In the second loop, the characters from str2 are copied into the result array directly after the final character from str1. This loop makes use of the fact that when the previous for loop finished execution, the value of i was equal to the number of characters in str1, excluding the null character. Therefore, the assignment statement

result[i + j] = str2[j];

is used to copy the characters from str2 into the proper locations of result.

After the second loop is completed, the concat function puts a null character at the end of the string. Study the function to ensure that you understand the use of i and j. Many program errors when dealing with character strings involve the use of an index number that is off by 1 in either direction.

Remember, to reference the first character of an array, an index number of 0 is used. In addition, if a character array string contains n characters, excluding the null byte, then string[n - 1] references the last (nonnull) character in the string, whereas string[n] references the null character. Furthermore, string must be defined to contain at least n + 1 characters, bearing in mind that the null character occupies a location in the array.

Returning to the program, the main routine defines two char arrays, s1 and s2, and sets their values using the new initialization technique previously described. The array s3 is defined to contain 20 characters, thus ensuring that sufficient space is reserved for the concatenated character string and saving you from the trouble of having to precisely calculate its size.

The concat function is then called with the three strings s1, s2, and s3 as arguments. The result, as contained in s3 after the concat function returns, is displayed using the %s format characters. Although s3 is defined to contain 20 characters, the printf function only displays characters from the array up to the null character.

2. Testing Two Character Strings for Equality

You cannot directly test two strings to see if they are equal with a statement such as

if ( string1 == string2 )

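because the equality operator compares only simple values, such as floats, ints, or chars, and not the contents of more sophisticated types, such as structures or arrays. Applied to two arrays, the test compiles, but it checks whether the two arrays occupy the same location in memory, not whether they hold the same characters.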

To determine if two strings are equal, you must explicitly compare the two character strings character by character. If you reach the end of both character strings at the same time, and if all of the characters up to that point are identical, the two strings are equal; otherwise, they are not.

It might be a good idea to develop a function that can be used to compare two character strings, as shown in Program 10.4. You can call the function equalStrings and have it take as arguments the two character strings to be compared. Because you are only interested in determining whether the two character strings are equal, you can have the function return a bool value of true (or nonzero) if the two strings are identical, and false (or zero) if they are not. In this way, the function can be used directly inside test statements, such as in

if ( equalStrings (string1, string2) )

Program 10.4   Testing Strings for Equality

// Function to determine if two strings are equal

#include <stdio.h>
#include <stdbool.h>

bool equalStrings (const char s1[], const char s2[])
{
    int  i = 0;
    bool areEqual;

    // advance while the characters match and neither string has ended
    while ( s1[i] == s2[i] &&
            s1[i] != '\0' && s2[i] != '\0' )
        ++i;

    if ( s1[i] == '\0' && s2[i] == '\0' )
        areEqual = true;
    else
        areEqual = false;

    return areEqual;
}

int main (void)
{
    bool equalStrings (const char s1[], const char s2[]);
    const char stra[] = "string compare test";
    const char strb[] = "string";

    printf ("%i\n", equalStrings (stra, strb));
    printf ("%i\n", equalStrings (stra, stra));
    printf ("%i\n", equalStrings (strb, "string"));

    return 0;
}

Program 10.4   Output

0
1
1

The equalStrings function uses a while loop to sequence through the character strings s1 and s2. The loop is executed so long as the two character strings are equal (s1[i] == s2[i]) and so long as the end of either string is not reached (s1[i] != ‘\0’ && s2[i] != ‘\0’). The variable i, which is used as the index number for both arrays, is incremented each time through the while loop.

The if statement that executes after the while loop has terminated determines if you have simultaneously reached the end of both strings s1 and s2. You could have used the statement

if ( s1[i] == s2[i] )

instead to achieve the same results. If you are at the end of both strings, the strings must be identical, in which case areEqual is set to true and returned to the calling routine. Otherwise, the strings are not identical and areEqual is set to false and returned.

In main, two character arrays stra and strb are set up and assigned the indicated initial values. The first call to the equalStrings function passes these two character arrays as arguments. Because these two strings are not equal, the function correctly returns a value of false, or 0.

The second call to the equalStrings function passes the string stra twice. The function correctly returns a true value to indicate that the two strings are equal, as verified by the program’s output.

The third call to the equalStrings function is a bit more interesting. As you can see from this example, you can pass a constant character string to a function that is expecting an array of characters as an argument. In Chapter 11, “Pointers,” you see how this works. The equalStrings function compares the character string contained in strb to the character string “string” and returns true to indicate that the two strings are equal.

3. Inputting Character Strings

By now, you are used to the idea of displaying a character string using the %s format characters. But what about reading in a character string from your window (or your “terminal window”)? Well, on your system, there are several library functions that you can use to input character strings. The scanf function can be used with the %s format characters to read in a string of characters up to a blank space, tab character, or the end of the line, whichever occurs first. So, the statements

char string[81];

scanf ("%s", string);

have the effect of reading in a character string typed into your terminal window and storing it inside the character array string. Note that unlike previous scanf calls, in the case of reading strings, the & is not placed before the array name (the reason for this is also explained in Chapter 11).

If the preceding scanf call is executed, and the following characters are entered:

Shawshank

the string “Shawshank” is read in by the scanf function and is stored inside the string array. If the following line of text is typed instead:

iTunes playlist

just the string “iTunes” is stored inside the string array because the blank space after the word iTunes terminates the string. If the scanf call is executed again, this time the string “playlist” is stored inside the string array because the scanf function always continues scanning from the most recent character that was read in.

The scanf function automatically terminates the string that is read in with a null character. So, execution of the preceding scanf call with the line of text

abcdefghijklmnopqrstuvwxyz

causes the entire lowercase alphabet to be stored in the first 26 locations of the string array, with string[26] automatically set to the null character.

If s1, s2, and s3 are defined to be character arrays of appropriate sizes, execution of the statement

scanf ("%s%s%s", s1, s2, s3);

with the line of text

micro computer system

results in the assignment of the string “micro” to s1, “computer” to s2, and “system” to s3. If the following line of text is typed instead:

system expansion

it results in the assignment of the string “system” to s1, and “expansion” to s2.

Because no further characters appear on the line, the scanf function then waits for more input to be entered from your terminal window.

In Program 10.5, scanf is used to read three character strings.

Program 10.5   Reading Strings with scanf

// Program to illustrate the %s scanf format characters

#include <stdio.h>

int main (void)
{
    char s1[81], s2[81], s3[81];

    printf ("Enter text:\n");

    scanf ("%s%s%s", s1, s2, s3);

    printf ("\ns1 = %s\ns2 = %s\ns3 = %s\n", s1, s2, s3);

    return 0;
}

Program 10.5   Output

Enter text:

system expansion
bus

s1 = system

s2 = expansion

s3 = bus

In the preceding program, the scanf function is called to read in three character strings: s1, s2, and s3. Because the first line of text contains only two character strings—where the definition of a character string to scanf is a sequence of characters up to a space, tab, or the end of the line—the program waits for more text to be entered. After this is done, the printf call is used to verify that the strings “system”, “expansion”, and “bus” are correctly stored inside the string arrays s1, s2, and s3, respectively.

If you type in more than 80 consecutive characters to the preceding program without pressing the spacebar, the tab key, or the Enter (or Return) key, scanf overflows one of the character arrays. This might cause the program to terminate abnormally or cause unpredictable things to happen. Unfortunately, scanf has no way of knowing how large your character arrays are. When handed a %s format, it simply continues to read and store characters until one of the noted terminator characters is reached.

If you place a number after the % in the scanf format string, this tells scanf the maximum number of characters to read. So, if you used the following scanf call:

scanf ("%80s%80s%80s", s1, s2, s3);

instead of the one shown in Program 10.5, scanf knows that no more than 80 characters are to be read and stored into any one of s1, s2, or s3. (You still have to leave room for the terminating null character that scanf stores at the end of the array. That’s why %80s is used instead of %81s.)

4. Single-Character Input

The standard library provides several functions for the express purposes of reading and writing single characters and entire character strings. A function called getchar can be used to read in a single character from the terminal. Repeated calls to the getchar function return successive single characters from the input. When the end of the line is reached, the function returns the newline character ‘\n’. So, if the characters “abc” are typed at the terminal, followed immediately by the Enter (or Return) key, the first call to the getchar function returns the character ‘a’, the second call returns the character ‘b’, the third call returns ‘c’, and the fourth call returns the newline character ‘\n’. A fifth call to this function causes the program to wait for more input to be entered from the terminal.

You might be wondering why you need the getchar function when you already know how to read in a single character with the %c format characters of the scanf function. Using the scanf function for this purpose is a perfectly valid approach; however, the getchar function is a more direct approach because its sole purpose is for reading in single characters, and, therefore, it does not require any arguments. The function returns a single character that might be assigned to a variable or used as desired by the program.

In many text-processing applications, you need to read in an entire line of text. This line of text is frequently stored in a single place—generally called a “buffer”—where it is processed further. Using the scanf call with the %s format characters does not work in such a case because the string is terminated as soon as a space is encountered in the input.

Also available from the function library is a function called gets. The sole purpose of this function (you guessed it) is to read in a single line of text. Because gets has no way to limit the number of characters it stores, it was removed from the language in the C11 standard; fgets is the usual replacement. As an interesting program exercise, Program 10.6 shows how a function similar to the gets function, called readLine here, can be developed using the getchar function. The function takes a single argument: a character array in which the line of text is to be stored. Characters read from the terminal window up to, but not including, the newline character are stored in this array by the function.

Program 10.6   Reading Lines of Data

#include <stdio.h>

int main (void)
{
    int  i;
    char line[81];
    void readLine (char buffer[]);

    for ( i = 0; i < 3; ++i )
    {
        readLine (line);
        printf ("%s\n\n", line);
    }

    return 0;
}

// Function to read a line of text from the terminal

void readLine (char buffer[])
{
    char character;
    int  i = 0;

    do
    {
        character = getchar ();
        buffer[i] = character;
        ++i;
    }
    while ( character != '\n' );

    // overwrite the stored newline with the terminating null character
    buffer[i - 1] = '\0';
}

Program 10.6 Output

This is a sample line of text.

This is a sample line of text.



runtime library routines

runtime library routines

The do loop in the readLine function is used to build up the input line inside the character array buffer. Each character that is returned by the getchar function is stored in the next location of the array. When the newline character is reached, signaling the end of the line, the loop is exited. The null character is then stored inside the array to terminate the character string, replacing the newline character that was stored there the last time that the loop was executed. The index number i - 1 indexes the correct position in the array because the index number was incremented one extra time inside the loop the last time it was executed.

The main routine defines a character array called line with enough space reserved to hold 81 characters. This ensures that an entire line (80 characters has historically been used as the line length of a “standard terminal”) plus the null character can be stored inside the array. However, even in windows that display 80 or fewer characters per line, you are still in danger of overflowing the array if you continue typing past the end of the line without pressing the Enter (or Return) key. It is a good idea to extend the readLine function to accept as a second argument the size of the buffer. In this way, the function can ensure that the capacity of the buffer is not exceeded.

The program then enters a for loop, which simply calls the readLine function three times. Each time that this function is called, a new line of text is read from the terminal. This line is simply echoed back at the terminal to verify proper operation of the function. After the third line of text has been displayed, execution of Program 10.6 is then complete.

For your next program example (see Program 10.7), consider a practical text-processing application: counting the number of words in a portion of text. Develop a function called countWords, which takes as its argument a character string and which returns the number of words contained in that string. For the sake of simplicity, assume here that a word is defined as a sequence of one or more alphabetic characters. The function can scan the character string for the occurrence of the first alphabetic character and considers all subsequent characters up to the first nonalphabetic character as part of the same word. Then, the function can continue scanning the string for the next alphabetic character, which identifies the start of a new word.

Program 10.7   Counting Words

// Function to determine if a character is alphabetic

#include <stdio.h>
#include <stdbool.h>

bool alphabetic (const char c)
{
    if ( (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') )
        return true;
    else
        return false;
}

/* Function to count the number of words in a string */

int countWords (const char string[])
{
    int  i, wordCount = 0;
    bool lookingForWord = true, alphabetic (const char c);

    for ( i = 0; string[i] != '\0'; ++i )
        if ( alphabetic(string[i]) )
        {
            if ( lookingForWord )
            {
                ++wordCount;
                lookingForWord = false;
            }
        }
        else
            lookingForWord = true;

    return wordCount;
}

int main (void)
{
    const char text1[] = "Well, here goes.";
    const char text2[] = "And here we go... again.";
    int  countWords (const char string[]);

    printf ("%s - words = %i\n", text1, countWords (text1));
    printf ("%s - words = %i\n", text2, countWords (text2));

    return 0;
}

Program 10.7 Output

Well, here goes. - words = 3
And here we go... again. - words = 5

The alphabetic function is straightforward enough—it simply tests the value of the character passed to it to determine if it is either a lowercase or uppercase letter. If it is either, the function returns true, indicating that the character is alphabetic; otherwise, the function returns false.

The countWords function is not as straightforward. The integer variable i is used as an index number to sequence through each character in the string. The Boolean variable lookingForWord is used as a flag to indicate whether you are currently in the process of looking for the start of a new word. At the beginning of the execution of the function, you obviously are looking for the start of a new word, so this flag is set to true. The local variable wordCount is used for the obvious purpose of counting the number of words in the character string.

For each character inside the character string, a call to the alphabetic function is made to determine whether the character is alphabetic. If the character is alphabetic, the lookingForWord flag is tested to determine if you are in the process of looking for a new word. If you are, the value of wordCount is incremented by 1, and the lookingForWord flag is set to false, indicating that you are no longer looking for the start of a new word.

If the character is alphabetic and the lookingForWord flag is false, this means that you are currently scanning inside a word. In such a case, the for loop is continued with the next character in the string.

If the character is not alphabetic—meaning either that you have reached the end of a word or that you have still not found the beginning of the next word—the flag lookingForWord is set to true (even though it might already be true).

When all of the characters inside the character string have been examined, the function returns the value of wordCount to indicate the number of words that were found in the character string.

It is helpful to present a table of the values of the various variables in the countWords function to see how the algorithm works. Table 10.1 shows such a table, with the first call to the countWords function from the preceding program as an example. The first line of Table 10.1 shows the initial value of the variables wordCount and lookingForWord before the for loop is entered. Subsequent lines depict the values of the indicated variables each time through the for loop. So, the second line of the table shows that the value of wordCount has been set to 1 and the lookingForWord flag set to false (0) after the first time through the loop (after the ‘W’ has been processed). The last line of the table shows the final values of the variables when the end of the string is reached. You should spend some time studying this table, verifying the values of the indicated variables against the logic of the countWords function. After this has been accomplished, you should then feel comfortable with the algorithm that is used by the function to count the number of words in a string.

5. The Null String

Now consider a slightly more practical example of the use of the countWords function. This time, you make use of your readLine function to allow the user to type in multiple lines of text at the terminal window. The program then counts the total number of words in the text and displays the result.

To make the program more flexible, you do not limit or specify the number of lines of text that are entered. Therefore, you must have a way for the user to “tell” the program when he is done entering text. One way to do this is to have the user simply press the Enter (or Return) key an extra time after the last line of text has been entered. When the readLine function is called to read in such a line, the function immediately encounters the newline character and, as a result, stores the null character as the first (and only) character in the buffer. Your program can check for this special case and can know that the last line of text has been entered after a line containing no characters has been read.

A character string that contains no characters other than the null character has a special name in the C language; it is called the null string. When you think about it, the use of the null string is still perfectly consistent with all of the functions that you have defined so far in this chapter. The stringLength function correctly returns 0 as the size of the null string; your concat function also properly concatenates “nothing” onto the end of another string; even your equalStrings function works correctly if either or both strings are null (and in the latter case, the function correctly calls these strings equal).

Always remember that the null string does, in fact, have a character in it, albeit a null one.

Sometimes, it becomes desirable to set the value of a character string to the null string. In C, the null string is denoted by an adjacent pair of double quotation marks. So, the statement

char buffer[100] = "";

defines a character array called buffer and sets its value to the null string. Note that the character string “” is not the same as the character string “ ” because the second string contains a single blank character. (If you are doubtful, send both strings to the equalStrings function and see what result comes back.)

Program 10.8 uses the readLine, alphabetic, and countWords functions from previous programs. They have not been shown in the program listing to conserve space.

Program 10.8   Counting Words in a Piece of Text

#include <stdio.h>
#include <stdbool.h>

/***** Insert alphabetic function here *****/

/***** Insert readLine function here *****/

/***** Insert countWords function here *****/

int main (void)
{
    char text[81];
    int  totalWords = 0;
    int  countWords (const char string[]);
    void readLine (char buffer[]);
    bool endOfText = false;

    printf ("Type in your text.\n");
    printf ("When you are done, press 'RETURN'.\n\n");

    while ( ! endOfText )
    {
        readLine (text);

        // an empty line (the null string) marks the end of the text
        if ( text[0] == '\0' )
            endOfText = true;
        else
            totalWords += countWords (text);
    }

    printf ("\nThere are %i words in the above text.\n", totalWords);

    return 0;
}

Program 10.8   Output

Type in your text.

When you are done, press ‘RETURN’.

Wendy glanced up at the ceiling where the mound of lasagna loomed
like a mottled mountain range. Within seconds, she was crowned with
ricotta ringlets and a tomato sauce tiara. Bits of beef formed meaty
moles on her forehead. After the second thud, her culinary coronation
was complete.
Enter

There are 48 words in the above text.

The line labeled Enter indicates the pressing of the Enter or Return key.

The endOfText variable is used as a flag to indicate when the end of the input text has been reached. The while loop is executed as long as this flag is false. Inside this loop, the program calls the readLine function to read a line of text. The if statement then tests the input line that is stored inside the text array to see if just the Enter (or Return) key was pressed. If so, then the buffer contains the null string, in which case the endOfText flag is set to true to signal that all of the text has been entered.

If the buffer does contain some text, the countWords function is called to count the number of words in the text array. The value that is returned by this function is added into the value of totalWords, which contains the cumulative number of words from all lines of text entered thus far.

After the while loop is exited, the program displays the value of totalWords, along with some informative text, at the terminal.

It might seem that the preceding program does not help to reduce your work efforts much because you still have to manually enter all of the text at the terminal. But as you will see in Chapter 16, “Input and Output Operations in C,” this same program can also be used to count the number of words contained in a file stored on a disk, for example. So, an author using a computer system for the preparation of a manuscript might find this program extremely valuable as it can be used to quickly determine the number of words contained in the manuscript (assuming the file is stored as a normal text file and not in some word processor format like Microsoft Word).

Source: Kochan Stephen G. (2004), Programming in C: A Complete Introduction to the C Programming Language, Sams; Subsequent edition.
