Process Management in Unix/Linux: Pipes

Pipes are unidirectional inter-process communication channels for processes to exchange data. A pipe has a read end and a write end. Data written to the write end of a pipe can be read from the read end of the pipe. Since their debut in the original Unix, pipes have been incorporated into almost all operating systems, with many variations. Some systems allow pipes to be bidirectional, so that data can be transmitted in both directions. Ordinary pipes are for related processes. Named pipes are FIFO communication channels between unrelated processes. Reading and writing pipes are usually synchronous and blocking. Some systems support non-blocking and asynchronous read/write operations on pipes. For the sake of simplicity, we shall consider a pipe as a finite-sized FIFO communication channel between a set of related processes.

Reader and writer processes of a pipe are synchronized in the following manner. When a reader reads from a pipe, if the pipe has data, the reader reads as much as it needs (up to the pipe size) and returns the number of bytes read. If the pipe has no data but still has writers, the reader waits for data. When a writer writes data to a pipe, it wakes up the waiting readers, allowing them to continue. If the pipe has no data and also no writer, the reader returns 0. Since readers wait for data if the pipe still has writers, a 0 return value means only one thing, namely that the pipe has no data and also no writer. In that case, the reader can stop reading from the pipe.

When a writer writes to a pipe, if the pipe has room, it writes as much as it needs to or until the pipe is full, i.e. has no more room. If the pipe has no room but still has readers, the writer waits for room. When a reader reads data from the pipe to create more room, it wakes up the waiting writers, allowing them to continue. However, if a pipe has no more readers, the writer must detect this as a broken pipe error and abort.

1. Pipe Programming in Unix/Linux

In Unix/Linux, pipes are supported by a set of pipe-related syscalls. The syscall

int pd[2];         // array of 2 integers
int r = pipe(pd);  // return value r=0 if OK, -1 if failed

creates a pipe in the kernel and returns two file descriptors in pd[2], where pd[0] is for reading from the pipe and pd[1] is for writing to the pipe. However, a pipe is not intended for a single process. For example, after creating a pipe, if the process tries to read even 1 byte from the pipe, it would never return from the read syscall. This is because when the process tries to read from the pipe, there is no data yet but the pipe has a writer, so it waits for data. But who is the writer? It’s the process itself. So the process waits for itself, thereby locking itself up, so to speak. Conversely, if the process tries to write more than the pipe size (traditionally 4KB; 64KB by default on current Linux kernels), the process would again wait for itself when the pipe becomes full. Therefore, a process should be either a reader or a writer on a pipe, but not both.

The correct way of using pipes is as follows. After creating a pipe, the process forks a child process to share the pipe. During fork, the child inherits all the opened file descriptors of the parent. Thus the child also has pd[0] for reading from the pipe and pd[1] for writing to the pipe. The user must designate one of the processes as the writer and the other one as the reader of the pipe. The order does not matter as long as each process is designated to play only a single role. Assume that the parent is chosen as the writer and the child as the reader. Each process must close its unwanted pipe descriptor, i.e. the writer must close its pd[0] and the reader must close its pd[1]. Then the parent can write to the pipe and the child can read from the pipe. Figure 3.7 shows the system model of pipe operations.

On the left-hand side of Fig. 3.7, a writer process issues a

write(pd[1], wbuf, nbytes)

syscall to enter the OS kernel. It uses the file descriptor pd[1] to access the PIPE through the writeOFT. It executes write_pipe() to write data into the PIPE’s buffer, waiting for room if necessary.

On the right-hand side of Fig. 3.7, a reader process issues a

read(pd[0],rbuf,nbytes)

syscall to enter the OS kernel. It uses the file descriptor pd[0] to access the PIPE through the readOFT. Then it executes read_pipe() to read data from the PIPE’s buffer, waiting for data if necessary.

The writer process may terminate first when it has no more data to write, in which case the reader may continue to read as long as the PIPE still has data. However, if the reader terminates first, the writer should see a broken pipe error and also terminate.

Note that the broken pipe condition is not symmetrical. It is a condition of a communication channel in which there are writers but no reader. The converse is not a broken pipe since readers can still read as long as the pipe has data. The following program demonstrates pipes in Unix/Linux.

Example 3.7: The example program C3.7 demonstrates pipe operations.

/*************** C3.7: pipe operations ************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   // pipe(), fork(), read(), write(), close(), getpid()

int pd[2], n, i;
char line[256];
char *msg = "I AM YOUR PAPA";

int main()
{
    pipe(pd);                // create a pipe
    printf("pd=[%d, %d]\n", pd[0], pd[1]);
    if (fork()){             // fork a child to share the pipe
        printf("parent %d close pd[0]\n", getpid());
        close(pd[0]);        // parent as pipe WRITER
        while(i++ < 10){     // parent writes to pipe 10 times
            printf("parent %d writing to pipe\n", getpid());
            n = write(pd[1], msg, strlen(msg));   // write only the bytes of msg
            printf("parent %d wrote %d bytes to pipe\n", getpid(), n);
        }
        printf("parent %d exit\n", getpid());
    }
    else{
        printf("child %d close pd[1]\n", getpid());
        close(pd[1]);        // child as pipe READER
        while(1){            // child reads from pipe
            printf("child %d reading from pipe\n", getpid());
            if ((n = read(pd[0], line, 128)) > 0){   // try to read 128 bytes
                line[n]=0;
                printf("child read %d bytes from pipe: %s\n", n, line);
            }
            else             // pipe has no data and no writer (or read error)
                exit(0);
        }
    }
}

The reader may compile and run the program under Linux to observe its behavior. Figure 3.8 shows the sample outputs of running the Example 3.7 program.

In the above pipe program, both the parent and child will terminate normally. The reader may modify the program to do the following experiments and observe the results.

  1. Let the parent be the reader and child be the writer.
  2. Let the writer write continuously and the reader only read a few times.

In the second case, once the reader has exited, the writer should be terminated by a broken pipe error (delivered as the SIGPIPE signal on Unix/Linux).

2. Pipe Command Processing

In Unix/Linux, the command line

cmd1 | cmd2

contains a pipe symbol ‘|’. Sh will run cmd1 by a process and cmd2 by another process, which are connected by a PIPE, so that the outputs of cmd1 become the inputs of cmd2. The following shows typical usages of pipe commands.

ps x | grep "httpd"     # show lines of ps x containing httpd

cat filename | more     # display one screen of text at a time

3. Connect PIPE writer to PIPE reader

  • When sh gets the command line cmd1 | cmd2, it forks a child sh and waits for the child sh to terminate as usual.
  • Child sh: scan the command line for the | symbol. In this case,

cmd1 | cmd2

has a pipe symbol |. Divide the command line into head=cmd1, tail=cmd2.

  • Then the child sh executes the following code segment:

int pd[2], pid;
pipe(pd);             // creates a PIPE
pid = fork();         // fork a child (to share the PIPE)
if (pid){             // parent as pipe WRITER
    close(pd[0]);     // WRITER MUST close pd[0]
    close(1);         // close 1
    dup(pd[1]);       // replace 1 with pd[1]
    close(pd[1]);     // close pd[1]
    exec(head);       // change image to cmd1
} else{               // child as pipe READER
    close(pd[1]);     // READER MUST close pd[1]
    close(0);         // close 0
    dup(pd[0]);       // replace 0 with pd[0]
    close(pd[0]);     // close pd[0]
    exec(tail);       // change image to cmd2
}

The pipe writer redirects its fd=1 to pd[1], and the pipe reader redirects its fd=0 to pd[0]. Thus, the two processes are connected through the pipe.

4. Named pipes

Named pipes are also called FIFOs. They have “names” and exist as special files within the file system. They exist until they are removed with rm or unlink. They can be used by unrelated processes, not just descendants of the pipe creator.

Examples of named pipe

(1). From the sh, create a named pipe by the mknod command

mknod mypipe p

(2). Or, from a C program, issue the mknod() syscall

int r = mknod("mypipe", S_IFIFO | 0644, 0);   // file type FIFO with rw-r--r-- permissions

Either (1) or (2) creates a special file named mypipe in the current directory. Entering

ls -l mypipe

will show it as

prw-r--r-- 1 root root 0 time mypipe

where the file type p means it’s a pipe, with link count=1 and size=0.

(3). Processes may access named pipes as if they were ordinary files. However, writes to and reads from named pipes are synchronized by the Linux kernel.

The following diagram shows the interactions of writer and reader processes on a named pipe via sh commands. It shows that the writer stops if no one is reading from the pipe. The reader stops if the pipe has no data.

Instead of sh commands, processes created in C programs may also use named pipes.

Example 3.8: The example program C3.8 demonstrates named pipe operations. It shows how to open a named pipe for read/write by different processes.

/******* C3.8: Create and read/write named pipe ********/

(3).1. Writer process program:

#include <stdio.h>
#include <string.h>     // strlen()
#include <sys/stat.h>   // mknod(), S_IFIFO
#include <fcntl.h>      // open()
#include <unistd.h>     // write(), close()

char *line = "testing named pipe";

int main()
{
    int fd;
    mknod("mypipe", S_IFIFO | 0644, 0);   // create a named pipe
    fd = open("mypipe", O_WRONLY);        // open named pipe for write
    write(fd, line, strlen(line));        // write to pipe
    close(fd);
}

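(3).2. Reader process program:

#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>      // open()
#include <unistd.h>     // read(), close()

int main()
{
    char buf[128];
    int n;
    int fd = open("mypipe", O_RDONLY);    // open named pipe for read
    n = read(fd, buf, sizeof(buf) - 1);   // leave room for the terminating 0
    if (n < 0)
        n = 0;
    buf[n] = 0;                           // read() does not null-terminate
    printf("%s\n", buf);
    close(fd);
}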

Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.
