Working with Larger Programs in C: Dividing Your Program into Multiple Files

In every program that you’ve seen so far, it was assumed that the entire program was entered into a single file (presumably via some text editor, such as emacs, vim, or some Windows-based editor) and then compiled and executed. In this single file, all the functions that the program used were included, except, of course, for the system functions, such as printf and scanf. Standard header files such as <stdio.h> and <stdbool.h> were also included for definitions and function declarations. This approach works fine when dealing with small programs, that is, programs that contain up to 100 statements or so. However, when you start dealing with larger programs, this approach no longer suffices. As the number of statements in the program increases, so does the time it takes to edit the program and to subsequently recompile it. Not only that, large programming applications frequently require the efforts of more than one programmer. Having everyone work on the same source file, or even on their own copy of the same source file, is unmanageable.

C supports the notion of modular programming in that it does not require that all the statements for a particular program be contained in a single file. This means that you can enter your code for a particular module into one file, for another module into a different file, and so on. Here, the term module refers either to a single function or to a number of related functions that you choose to group logically.

If you’re working with a windows-based project management tool, such as Metrowerks’ CodeWarrior, Microsoft Visual Studio, or Apple’s Xcode, then working with multiple source files is easy. You simply have to identify the particular files that belong to the project on which you are working, and the software handles the rest for you. The next section describes how to work with multiple files if you’re not using such a tool, also known as an Integrated Development Environment (IDE). That is, the next section assumes you are compiling programs from the command line by directly issuing gcc or cc commands, for example.

1. Compiling Multiple Source Files from the Command Line

Suppose you have conceptually divided your program into three modules and have entered the statements for the first module into a file called mod1.c, the statements for the second module into a file called mod2.c, and the statements for your main routine into the file main.c. To tell the system that these three modules actually belong to the same program, you simply include the names of all three files when you enter the command to compile the program. For example, using gcc, the command

$ gcc mod1.c mod2.c main.c -o dbtest

has the effect of separately compiling the code contained in mod1.c, mod2.c, and main.c. Errors discovered in mod1.c, mod2.c, and main.c are separately identified by the compiler. For example, if the gcc compiler gives output that looks like this:

mod2.c:10: mod2.c: In function ‘foo’:

mod2.c:10: error: ‘i’ undeclared (first use in this function)

mod2.c:10: error: (Each undeclared identifier is reported only once

mod2.c:10: error: for each function it appears in.)

then the compiler indicates that mod2.c has an error at line 10, which is in the function foo. Because no messages are displayed for mod1.c and main.c, no errors are found compiling those modules.

Typically, if there are errors discovered in a module, you have to edit the module to correct the mistakes. In this case, because an error was discovered only inside mod2.c, you have to edit only this file to fix the mistake. You can then tell the C compiler to recompile your modules after the correction has been made:

$ gcc mod1.c mod2.c main.c -o dbtest

Because no error message was reported, the executable was placed in the file dbtest.

Normally, the compiler generates intermediate object files for each source file that it compiles. The compiler places the resulting object code from compiling mod.c into the file mod.o by default. (Most Windows compilers work similarly, only they might place the resulting object code into .obj files instead of .o files.) Typically, these intermediate object files are automatically deleted after the compilation process ends. Some C compilers (and, historically, the standard Unix C compiler) keep these object files around and do not delete them when you compile more than one file at a time. This fact can be used to your advantage for recompiling a program after making a change to only one or several of your modules. So in the previous example, because mod1.c and main.c had no compiler errors, the corresponding .o files (mod1.o and main.o) would still be around after the gcc command completed. Replacing the c from the filename mod.c with an o tells the C compiler to use the object file that was produced the last time mod.c was compiled. So, the following command line could be used with a compiler (in this case, cc) that does not delete the object code files:

$ cc mod1.o mod2.c main.o -o dbtest

So, not only do you not have to reedit mod1.c and main.c if no errors are discovered by the compiler, but you also don’t have to recompile them.

If your compiler automatically deletes the intermediate .o files, you can still take advantage of performing incremental compilations by compiling each module separately and using the -c command-line option. This option tells the compiler not to link your file (that is, not to try to produce an executable) and to retain the intermediate object file that it creates. So, typing

$ gcc -c mod2.c

compiles the file mod2.c, placing the resulting object code in the file mod2.o.

So, in general, you can use the following sequence to compile your three-module program dbtest using the incremental compilation technique:

$ gcc -c mod1.c                          Compile mod1.c => mod1.o

$ gcc -c mod2.c                          Compile mod2.c => mod2.o

$ gcc -c main.c                          Compile main.c => main.o

$ gcc mod1.o mod2.o main.o -o dbtest     Create executable

The three modules are compiled separately. Because the commands produced no messages, no errors were detected by the compiler. If any had been, the file could be edited and incrementally recompiled. The last line, which reads

$ gcc mod1.o mod2.o main.o -o dbtest

lists only object files and no source files. In this case, the object files are just linked together to produce the executable output file dbtest.

If you extend the preceding examples to programs that consist of many modules, you can see how this mechanism of separate compilations can enable you to develop large programs more efficiently. For example, the commands

$ gcc -c legal.c                         Compile legal.c, placing output in legal.o

$ gcc legal.o makemove.o exec.o enumerator.o evaluator.o display.o -o superchess

could be used to compile a program consisting of six modules, in which only the module legal.c needs to be recompiled.

As you’ll see in the last section of this chapter, the process of incremental compilation can be automated by using a tool called make. The IDE tools that were mentioned at the beginning of this chapter invariably have this knowledge of what needs recompilation, and they only recompile files as necessary.

Source: Kochan Stephen G. (2004), Programming in C: A Complete Introduction to the C Programming Language, Sams; Subsequent edition.
