Process Management in Unix/Linux: I/O Redirection

1. FILE Streams and File Descriptors

Recall that the sh process has three FILE streams for terminal I/O: stdin, stdout and stderr. Each is a pointer to a FILE structure in the execution image’s HEAP area, as shown below.

FILE *stdin  ----> FILE structure
                   -----------------------------------------------
                   char fbuf[SIZE]
                   int counter, index, etc.
                   int fd = 0;   // fd[0] in PROC <== from KEYBOARD
                   -----------------------------------------------

FILE *stdout ----> FILE structure
                   -----------------------------------------------
                   char fbuf[SIZE]
                   int counter, index, etc.
                   int fd = 1;   // fd[1] in PROC ==> to SCREEN
                   -----------------------------------------------

FILE *stderr ----> FILE structure
                   -----------------------------------------------
                   char fbuf[SIZE]
                   int counter, index, etc.
                   int fd = 2;   // fd[2] in PROC ==> to SCREEN also
                   -----------------------------------------------

Each FILE stream corresponds to an opened file in the Linux kernel. Each opened file is identified within the process by a file descriptor, which is a small integer. The file descriptors of stdin, stdout and stderr are 0, 1 and 2, respectively. When a process forks a child, the child inherits all the opened files of the parent. Therefore, the child also has the same FILE streams and file descriptors as the parent.
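The two claims above can be checked with a short sketch. The helper names std_streams_use_fds_012() and child_inherits_fd() are hypothetical, and /dev/null is used only as a convenient file to open; neither comes from the original text.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if stdin, stdout and stderr sit on
 * file descriptors 0, 1 and 2, else 0. */
int std_streams_use_fds_012(void)
{
    return fileno(stdin) == 0 && fileno(stdout) == 1 && fileno(stderr) == 2;
}

/* Hypothetical helper: the parent opens a file, forks, and the child checks
 * that the same descriptor number is valid without opening anything itself.
 * Returns 1 if the child saw the descriptor, 0 if not, -1 on error. */
int child_inherits_fd(void)
{
    int fd = open("/dev/null", O_RDONLY);   /* opened by the parent only */
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: F_GETFD fails with -1 only if fd is not an open descriptor. */
        _exit(fcntl(fd, F_GETFD) == -1 ? 1 : 0);
    }

    int status = 0;
    waitpid(pid, &status, 0);               /* parent collects the verdict */
    close(fd);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The child uses _exit() rather than exit() so that it does not flush any stdio buffers it inherited from the parent.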

2. FILE Stream I/O and System Call

When a process executes the library function

scanf("%s", item);   // item is a char array large enough for the input

it tries to input a (string) item from stdin, which points to a FILE structure. If the FILE structure’s fbuf[ ] is empty, it issues a READ system call to the Linux kernel to read data from the file descriptor 0, which is mapped to the keyboard of a terminal (/dev/ttyX) or a pseudo-terminal (/dev/pts/#).
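The refill step can be modeled with a simplified stand-in for the FILE structure. This is only a sketch of the idea, not the real C library layout or code: struct my_file, my_getc() and the SIZE value are assumptions made for illustration.

```c
#include <unistd.h>

#define SIZE 4096   /* assumed buffer size, chosen for illustration */

/* Simplified stand-in for a FILE structure; not the real libc layout. */
struct my_file {
    char fbuf[SIZE];   /* user-space buffer */
    int  counter;      /* bytes still unread in fbuf */
    int  index;        /* position of the next unread byte */
    int  fd;           /* underlying file descriptor */
};

/* Return the next byte from fp, or -1 on end of file or error.
 * The read() syscall is issued only when fbuf is empty. */
int my_getc(struct my_file *fp)
{
    if (fp->counter <= 0) {
        ssize_t n = read(fp->fd, fp->fbuf, SIZE);   /* refill from the kernel */
        if (n <= 0)
            return -1;
        fp->counter = (int)n;
        fp->index = 0;
    }
    fp->counter--;
    return (unsigned char)fp->fbuf[fp->index++];
}
```

With fd set to 0, repeated calls to my_getc() would consume keyboard input one byte at a time while entering the kernel only once per buffer refill, which is the behavior described for scanf() above.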

3. Redirect stdin

If we replace the file descriptor 0 with a newly opened file, inputs would come from that file rather than the original input device. Thus, if we do

#include <fcntl.h>                   // O_RDONLY, O_WRONLY, O_APPEND, etc.
#include <unistd.h>                  // close()

close(0);                            // syscall to close file descriptor 0
int fd = open("filename", O_RDONLY); // open filename for READ; fd replaces 0

The syscall close(0) closes the file descriptor 0, making 0 an unused file descriptor. The open() syscall opens a file and uses the lowest unused descriptor number as the file descriptor. In this case, the file descriptor of the opened file would be 0. Thus the original fd 0 is replaced by the newly opened file. Alternatively, we may also use
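The lowest-unused-descriptor rule can be observed directly. In the following sketch, fd_after_close0() is a hypothetical helper; it saves and restores the original fd 0 so that the caller's input is not lost, a step the fragment above does not include.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close fd 0, open `path`, and report which descriptor
 * number open() handed out. The original fd 0 is restored before returning.
 * Returns the descriptor number (expected: 0), or -1 on error. */
int fd_after_close0(const char *path)
{
    int saved = dup(0);              /* keep the original input around */
    if (saved < 0)
        return -1;

    close(0);                        /* 0 is now the lowest unused number */
    int fd = open(path, O_RDONLY);   /* expected to come back as 0 */

    dup2(saved, 0);                  /* put the original input back on 0 */
    if (fd > 0)
        close(fd);                   /* only needed if open() did not return 0 */
    close(saved);
    return fd;
}
```

In a single-threaded process whose fd 0 was open beforehand, a call such as fd_after_close0("/dev/null") should return 0.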

int fd = open("filename", O_RDONLY); // get a fd first
close(0);                            // free file descriptor 0
dup(fd);                             // duplicate fd to 0

The syscall dup(fd) duplicates fd into the lowest numbered and unused file descriptor, allowing both fd and 0 to access the same opened file. In addition, the syscall

dup2(fd1, fd2)

duplicates fd1 into fd2, closing fd2 first if it was already open. Thus, Unix/Linux provides several ways to replace/duplicate file descriptors. After any one of the above operations, the file descriptor 0 is either replaced or duplicated with the opened file, so that every scanf() call will get inputs from the opened file.
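As a sketch of the dup2() variant, the hypothetical helpers redirect_stdin() and restore_stdin() below point fd 0 at a file and later switch it back. Saving the old descriptor with dup() is an addition for illustration; the fragments above simply replace fd 0.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make fd 0 refer to the file `path`.
 * Returns a saved copy of the old fd 0 for restore_stdin(), or -1 on error. */
int redirect_stdin(const char *path)
{
    int saved = dup(0);              /* remember the original input */
    if (saved < 0)
        return -1;

    int fd = open(path, O_RDONLY);   /* gets some descriptor other than 0 */
    if (fd < 0 || dup2(fd, 0) < 0) { /* dup2 closes the old fd 0 first */
        if (fd >= 0)
            close(fd);
        close(saved);
        return -1;
    }
    close(fd);                       /* fd 0 alone now refers to the file */
    return saved;
}

/* Hypothetical helper: undo redirect_stdin(). */
void restore_stdin(int saved)
{
    dup2(saved, 0);
    close(saved);
}
```

Between the two calls, scanf() and any other read from stdin take their input from the file, because the stdin FILE structure still reads from file descriptor 0.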

4. Redirect stdout

When a process executes the library function

printf("format=%s\n", items);

it tries to write to the fbuf[ ] in the stdout FILE structure, which is line buffered when stdout refers to a terminal. If fbuf[ ] has a complete line, it issues a WRITE syscall to write data from fbuf[ ] to file descriptor 1, which is mapped to the terminal screen. To redirect the standard output to a file, do as follows.

close(1);                                  // free file descriptor 1
open("filename", O_WRONLY|O_CREAT, 0644);  // lowest unused fd is 1

These change file descriptor 1 to refer to the newly opened file, so output to stdout goes to that file instead of the screen. stderr can be redirected to a file in the same way, using file descriptor 2. When a process terminates, the kernel closes all of its opened files.
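A minimal sketch of stdout redirection follows. The helpers redirect_stdout() and restore_stdout() are hypothetical. The O_TRUNC flag, the fflush() calls and the saved descriptor are additions for illustration, included so that buffered text ends up on the intended side of each switch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make fd 1 refer to the file `path`.
 * Returns a saved copy of the old fd 1 for restore_stdout(), or -1 on error. */
int redirect_stdout(const char *path)
{
    fflush(stdout);                  /* pending text still belongs to the old fd 1 */

    int saved = dup(1);              /* remember the original output */
    if (saved < 0)
        return -1;

    /* O_TRUNC is an assumption here: start the output file empty. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || dup2(fd, 1) < 0) {
        if (fd >= 0)
            close(fd);
        close(saved);
        return -1;
    }
    close(fd);                       /* fd 1 alone now refers to the file */
    return saved;
}

/* Hypothetical helper: undo redirect_stdout(). */
void restore_stdout(int saved)
{
    fflush(stdout);                  /* write buffered text to the file first */
    dup2(saved, 1);
    close(saved);
}
```

Any printf() issued between the two calls lands in the file. The flush in restore_stdout() matters because stdout is usually fully buffered, not line buffered, once it no longer refers to a terminal.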

Source: Wang, K. C. (2018). Systems Programming in Unix/Linux. Springer, 1st ed.
