Block Device I/O Buffers

In Chap. 11, we showed the algorithms of read/write regular files. The algorithms rely on two key operations, get_block and put_block, which read/write a disk block to/from a buffer in memory. Since disk I/O are slow in comparison with memory access, it is undesirable to do disk I/O on every read/ write file operation. For this reason, most file systems use I/O buffering to reduce the number of physical I/O to/from storage devices. A well designed I/O buffering scheme can significantly improve file I/O efficiency and increase system throughput.

The basic principle of I/O buffering is very simple. The file system uses a set of I/O buffers as a cache memory for block devices. When a process tries to read a disk block identified by (dev, blk), it first searches the buffer cache for a buffer already assigned to the disk block. If such a buffer exists and contains valid data, it simply reads from the buffer without reading the block from disk again. If such a buffer does not exist, it allocates a buffer for the disk block, reads data from disk into the buffer, then reads data from the buffer. Once a block is read in, the buffer will be kept in the buffer cache for the next read/write requests of the same block by any process. Similarly, when a process writes to a disk block, it first gets a buffer assigned to the block. Then it writes data to the buffer, marks the buffer as dirty for delay write and releases it to the buffer cache. Since a dirty buffer contains valid data, it can be used to satisfy subsequent read/write requests for the same block without incurring real disk I/O. Dirty buffers will be written to disk only when they are to be reassigned to different blocks.

Before discussing buffer management algorithms, we first introduce the following terms. In read_file/write_file, we have assumed that they read/write from/to a dedicated buffer in memory. With I/O buffering, the buffer will be allocated dynamically from a buffer cache. Assume that BUFFER is the structure type of buffers (defined below) and getblk(dev, blk) allocates a buffer assigned to (dev, blk) from the buffer cache. Define a bread(dev, blk) function, which returns a buffer (pointer) containing valid data.

BUFFER *bread(dev,blk) // return a buffer containing valid data {

BUFFER *bp = getblk(dev,blk); // get a buffer for (dev,blk)

if (bp data valid)

return bp;

bp->opcode = READ;             // issue READ operation

start_io(bp);                 // start I/O on device

wait for I/O completion;

return bp;


After reading data from a buffer, the process releases the buffer back to the buffer cache by brelse(bp). Similarly, define a write_block(dev, blk, data) function as

write_block(dev, blk, data)       // write data from U space


BUFFER *bp = bread(dev,blk); // read in the disk block first

write data to bp;

(synchronous write)? bwrite(bp) : dwrite(bp);


where bwrite(bp) is for synchronous write and dwrite(bp) is for delay-write, as shown below.

Synchronous write waits for the write operation to complete. It is used for sequential or removable block devices, e.g. USB drives. For random access devices, e.g. hard disks, all writes can be delay writes. In delay write, dwrite(bp) marks the buffer as dirty and releases it to the buffer cache. Since dirty buffers contain valid data, they can be used to satisfy subsequent read/write requests of the same block. This not only reduces the number of physical disk I/O but also improves the buffer cache effect. A dirty buffer will be written to disk only when it is to be reassigned to a different disk block, at which time the buffer is written out by

awrite(BUFFER *bp)


bp->opcode = ASYNC;      // for ASYNC write;



awrite() calls start_io() to start I/O operation on the buffer but does not wait for the operation to complete. When an ASYNC write operation completes, the disk interrupt handler will release the buffer.

Physical block device I/O: Each device has an I/O queue which contains buffers of pending I/O. The start_io() operation on a buffer is

start_io(BUFFER *bp)


enter bp into device I/O queue;

if (bp is first buffer in I/O queue)

issue I/O command for bp to device;


When an I/O operation completes, the device interrupt handler finishes the I/O operation on the current buffer and starts I/O for the next buffer in the I/O queue if it is non-empty. The algorithm of the device interrupt handler is



bp = dequeue(device I/O queue); // bp = remove head of I/O queue

(bp->opcode == ASYNC)? brelse(bp) : unblock process on bp;

if (!empty(device I/O queue))

issue I/O command for first bp in I/O queue;


Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.

Leave a Reply

Your email address will not be published. Required fields are marked *