Process Management in Unix/Linux: A Multitasking System

1. Multitasking

In general, multitasking refers to the ability of performing several independent activities at the same time. For example, we often see people talking on their cell phones while driving. In a sense, these people are doing multitasking, although a very bad kind. In computing, multitasking refers to the execution of several independent tasks at the same time. In a uniprocessor (single CPU) system, only one task can execute at a time. Multitasking is achieved by multiplexing the CPU’s execution time among different tasks, i.e. by switching the CPU’s execution from one task to another. The mechanism of switching executions among different tasks is called context switching, which changes the execution environment of one task to that of another. If the switch is fast enough, it gives the illusion that all the tasks are executing simultaneously. This logical parallelism is called concurrency. In multiprocessor systems with many CPUs or processor cores, tasks can execute on different CPUs in parallel in real time. In addition, each processor may also do multitasking by executing different tasks concurrently. Multitasking is the basis of all operating systems. It is also the basis of concurrent programming in general.

2. The Process Concept

An operating system is a multitasking system. In an operating system, tasks are also called processes. For all practical purposes, the terms task and process can be used interchangeably. In Chap. 2, we defined an execution image as a memory area containing the execution’s code, data and stack. Formally, we define process as

2.1. A process is the execution of an image.

It is a sequence of executions regarded by the OS kernel as a single entity for using system resources. System resources include memory space, I/O devices and, most importantly, CPU time. In an OS kernel, each process is represented by a unique data structure, called the Process Control Block (PCB) or Task Control Block (TCB), etc. In this book, we shall simply call it the PROC structure. Like a personal record, which contains all the information of a person, a PROC structure contains all the information of a process. In a real OS, the PROC structure may contain many fields and quite large. To begin with, we shall define a very simple PROC structure to represent processes.

In the PROC structure, next is a pointer to the next PROC structure. It is used to maintain the PROCs in various dynamic data structures, such as link lists and queues. The field ksp is the saved stack pointer. When a process gives up the CPU, it saves the execution context in stack and saves the stack pointer in PROC.ksp for resumption later. Among the other fields of the PROC structure, pid is the process ID number that identifies a process, ppid is the parent process ID number, status is the current status of the process, priority is the process scheduling priority and kstack is the process stack when it is executing. An OS kernel usually defines a finite number of PROC structures in its data area, denoted by

 PROC proc[NPROC];    // NPROC a constant, e.g. 64

which are used to represent processes in the system. In a single CPU system, only one process can be executing at a time. The OS kernel usually uses a global PROC pointer, running or current, to point at the PROC that is currently executing. In a multiprocessor OS with multiple CPUs, many processes may be executing on different CPUs in parallel in real time. Accordingly, in a MP system running [NCPU] may be an array of pointers, each points at a PROC running on a specific CPU. For simplicity, we shall only consider single CPU systems.

3. A Multitasking System

In order for the reader to have a better understanding of multitasking and processes, we begin with a programming example, which is designed to illustrate the principles of multitasking, context switching and processes. The program implements a multitasking environment, which simulates the kernel mode operations of an operating system. The multitasking system, denoted by MT, consists of the following components.

3.1. type.h file

The type.h file defines system constants and a simple PROC structure to represent processes.

/*********** type.h file ************/

#define NPROC 9             // number of PROCs

#define SSIZE 1024          // stack size = 4KB

// PROC status

#define FREE      0

#define  READY    1

#define  SLEEP    2

#define  ZOMBIE   3

As we expand the MT system, we shall add more fields to the PROC structure later.

3.2. The ts.s file

The ts.s file implements process context switching in 32-bit GCC assembly code, which will be explained in later sections.

3.3. The queue.c file

The queue.c file implements queue and list operation functions. The function enqueue() enters a PROC into a queue by priority. In a priority queue, PROCs of the same priority are ordered by First-In-First- Out (FIFO). The function dequeue() returns the first element removed from a queue or a link list. The function printList() prints the elements of a link list.

/***************** queue.c file *****************/

int enqueue(PROC **queue, PROC *p)

{

PROC *q = *queue;

if (q == 0 || p->priority > q->priority){

*queue = p;

p->next = q;

}

else{

while (q->next && p->priority <= q->next->priority)

q = q->next;

p->next = q->next;

q->next = p;

}

}

PROC *dequeue(PROC **queue)

{

PROC *p = *queue; if (p)

*queue = (*queue)->next;

return p;

}

int printList(char *name, PROC *p)

{

printf(“%s = “, name);

while(p){

printf(“[%d %d]->”, p->pid, p->priority);

p = p->next;

}

printf(“NULL\n”);

}

3.4. The t.c file

The t.c file defines the MT system data structures, system initialization code and process management functions.

/*********** t.c file of A Multitasking System *********/

#include <stdio.h>

#include “type.h”

PROC proc[NPROC];  // NPROC PROCs
PROC *freeList;    // freeList of PROCs
PROC *readyQueue;  // priority queue of READY procs
PROC *running;     // current running proc pointer

#include “queue.c” // include queue.c file

/*******************************************************

kfork() creates a child process; returns child pid.

When scheduled to run, child PROC resumes to body();

********************************************************/

int kfork()

{

int i;

PROC *p = dequeue(&freeList); if (!p){

printf(“no more proc\n”);

return(-1);

}

/* initialize the new proc and its stack */

p->status = READY;

p->priority = 1;         // ALL PROCs priority=1,except P0

p->ppid = running->pid;

/************ new task initial stack contents ************

kstack contains:  |retPC|eax|ebx|ecx|edx|ebp|esi|edi|eflag|

                     -1  -2   -3  -4  -5  -6  -7  -8   -9

**********************************************************/

for (i=1; i<10; i++)               // zero out kstack cells

p->kstack[SSIZE – i] =0;

p->kstack[SSIZE-1] = (int)body;    // retPC -> body()

p->ksp = &(p->kstack[SSIZE – 9]);   // PROC.ksp -> saved eflag

enqueue(&readyQueue, p);            // enter p into readyQueue

return p->pid;

}

int kexit()

{

running->status = FREE;

running->priority = 0;

enqueue(&freeList, running);

printList(“freeList”, freeList);

tswitch();

}

int do_kfork()

{

int child = kfork();

if (child < 0)

printf(“kfork failed\n”);

else{

printf(“proc %d kforked a child = %d\n”, running->pid, child);

printList(“readyQueue”, readyQueue);

}

return child;

}

int do_switch()

{

tswitch();

}

int do_exit

{

kexit();

}

int body() // process body function

{

int c;

printf(“proc %d starts from body()\n”, running->pid);

while(1){

printf(“***************************************\n”);

printf(“proc %d running: parent=%d\n”, running->pid,running->ppid);

printf(“enter a key [f|s|q]   : “);

c = getchar(); getchar();    // kill the \r key

switch(c){

case  ‘f’: do_kfork();     break;

case  ‘s’: do_switch();    break;

case  ‘q’: do_exit();      break;

}

}

}

// initialize the MT system; create P0 as initial running process

int init()

{

int i;

PROC *p;

for (i=0; i<NPROC; i++){ // initialize PROCs

p = &proc[i];

p->pid = i;            // PID = 0 to NPROC-1

p->status = FREE;

p->priority = 0;

p->next = p+1;

}

proc[NPROC-1].next = 0;

freeList = &proc[0];  // all PROCs in freeList
readyQueue = 0;  // readyQueue = empty

// create P0 as the initial running process

p = running = dequeue(&freeList); // use proc[0]

p->status = READY;

p->ppid =0;              // P0 is its own parent

printList(“freeList”, freeList);

printf(“init complete: P0 running\n”);

}

/*************** main() function ***************/

int main()

{

printf(“Welcome to the MT Multitasking System\n”);

init(); // initialize system; create and run P0 kfork();

// kfork P1 into readyQueue while(1){

printf(“P0: switch process\n”);

if (readyQueue)

tswitch();

}

}

/*********** scheduler *************/

int scheduler()

{

printf(“proc %d in scheduler()\n”, running->pid);

if (running->status == READY)

enqueue(&readyQueue, running);

printList(“readyQueue”, readyQueue);

running = dequeue(&readyQueue);

printf(“next running = %d\n”, running->pid);

}

5. Explanations of the Multitasking System Code

We explain the base code of the MT multitasking system by the following steps.

(1). The Virtual CPU: The MT system is compile-linked under Linux as

gcc -m32 t.c ts.s

Then run a.out. The entire MT system runs as Linux process in user mode. Within the Linux process, we create independent execution entities called tasks and schedule them to run inside the Linux process by our own scheduling algorithm. To the tasks in the MT system, the Linux process behaves as a virtual CPU. Barring any confusion, we shall refer to the execution entities in the MT system as either tasks or processes.

(2). init(): When the MT system starts, main() calls init() to initialize the system. Init() initializes the

PROC structures and enters them into a freeList. It also initializes the readyQueue to empty. Then it uses proc[0] to create P0 as the initial running process. P0 has the lowest priority 0. All other tasks will have priority 1, so that they will take turn to run from the readyQueue.

(3). P0 calls kfork() to create a child process P1 with priority 1 and enter it into the ready queue. Then

      P0 calls tswitch(), which will switch task to run P1.

(4) . tswitch(): The tswitch() function implements process context switching. It acts as a process switch box, where one process goes in and, in general, another process emerges. tswitch() consists of 3 separated steps, which are explained in more detail below.

(4).1. SAVE part of tswitch(): When an executing task calls tswitch(), it saves the return address on stack and enters tswitch() in assembly code. In tswitch(), the SAVE part saves CPU registers into the calling task’s stack and saves the stack pointer into proc.ksp. The Intel x86 CPU in 32-bit mode has many registers, but only the registers eax, ebx, ecx, edx, ebp, esi, edi and eflag are visible to a Linux process in user mode, which is the virtual CPU of the MT system. So we only need to save and restore these registers of the virtual CPU. The following diagram shows the stack contents and the saved stack pointer of the calling task after executing the SAVE part of tswitch(), where xxx denote the stack contents before calling tswitch().

   proc.ksp

 |

|xxx|retPC|eax|ebx|ecx|edx|ebp|esi|edi|eflag|

In the Intel x86 based PCs in 32-bit mode, every CPU register is 4 bytes wide and stack operations are always in units of 4 bytes. Therefore, we may define each PROC stack as an integer array in the PROC structure.

(4).2. scheduler(): After executing the SAVE part of tswitch(), the task calls scheduler() to pick the next running task. In scheduler(), if the calling task is still READY to run, it calls enqueue() to put itself into the readyQueue by priority. Otherwise, it will not be in the readyQueue, which makes it non-runnable. Then it calls dequeue(), which returns the first PROC removed from readyQueue as the new running task.

(4).3. RESUME part of tswitch(): When execution returns from scheduler(), running may have changed to point at the PROC of a different task. Whichever PROC running points at, it is the current running task. The RESUME part of tswitch() sets the CPU’s stack pointer to the saved stack pointer of the current running task. Then it pops saved registers, followed by RET, causing the current running task return to where it called tswitch() earlier.

(5). kfork(): The kfork() function creates a child task and enters it into the readyQueue. Every newly created task begins execution from the same body() function. Although the new task never existed before, we may pretend that it not only existed before but also ran before. The reason why it is not running now is because it called tswitch() to give up CPU earlier. If so, its stack must contain a frame saved by the SAVE part of tswitch(), and its saved ksp must point at the stack top. Since the new task never really ran before, we may assume that its stack was empty and all CPU register contents are 0 when it called tswitch(). Thus, in kfork(), we initialize the stack of the new task as follows.

  proc.ksp

 |

|< – all saved registers = 0 ->|

|body|eax|ecx|edx|ebx|ebp|esi|edi|eflags|

-1    -2  -3  -4  -5  -6  -7  -8    -9

where the index -i means SSIZE-i. These are done by the following code segment in kfork().

/************** task initial stack contents **************

kstack contains:       |retPC|eax|ebx|ecx|edx|ebp|esi|edi|eflag|

-1    -2  -3  -4  -5  -6  -7  -8   -9

**********************************************************/

for (i=1; i<10; i++)                                // zero out kstack cells

p->kstack[SSIZE-i] = 0;

p->kstack[SSIZE-1] = (int)body; // retPC -> body()

p->ksp = &(p->kstack[SSIZE-9]); // PROC.ksp -> saved eflag

When the new task starts to run, it begins by executing the RESUME part of tswitch(), which causes it to return to the entry address of the body() function. When execution first enters body(), the task’s stack is logically empty. As soon as execution starts, the task’ s stack will grow and shrink as described in Chap. 2 on stack usage and stack frames. In practice, the body() function never returns, so there is no need for a return address in the stack. For variations of the initial stack contents, the reader may consult Problem 3.2 in the Problem section.

(6). body(): For demonstration purpose, all tasks are created to execute the same body() function. This shows the difference between processes and programs. Many processes may execute the same program code, but each process executes in its own context. For instance, all (automatic) local variables in body() are private to the process since they are all allocated in the per process stack. If the body() function calls other functions, all the calling sequences are maintained in the per process stack, etc. While executing in body(), a process prompts for a command char = [flsiq], where

f : kfork a new child process to execute body();

s : switch process;

q : terminate and return the PROC as FREE to freeList

(6). The Idle Task P0: P0 is special in that it has the lowest priority among all the tasks. After system initialization, P0 creates P1 and switches to run P1. P0 will run again if and only if there are no runnable tasks. In that case, P0 simply loops. It will switch to another task whenever readyQueue becomes nonempty. In the base MT system, P0 runs again if all other processes have terminated. To end the MT system, the user may enter Control-C to kill the Linux process.

(7). Running the MT Multitasking System: Under Linux, enter

gcc -m32 t.c s.s

to compile-link the MT system and run the resulting a.out Figure 3.1 shows the sample outputs of running the MT multitasking system. The figures show the MT system initialization, the initial process P0, which creates P1 and switches task to run P1. P1 kforks a child process P2 and switches task to run P2. While a process is running, the reader may enter other commands to test the system.

Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.

Leave a Reply

Your email address will not be published. Required fields are marked *