File Operations in Unix/Linux: Introduction to EXT2 File System

For many years, Linux used EXT2 (Card et al. 1995; EXT2 2001) as the default file system. EXT3 (EXT3 2015) is an extension of EXT2. The main addition in EXT3 is a journal file, which records changes made to the file system in a journal log. The log allows for quicker recovery from errors in case of a file system crash. An EXT3 file system with no error is identical to an EXT2 file system. The newest extension of EXT3 is EXT4 (Cao et al. 2007). The major change in EXT4 is in the allocation of disk blocks. In EXT4, block numbers are 48 bits. Instead of discrete disk blocks, EXT4 allocates contiguous ranges of disk blocks, called extents.

1. EXT2 File System Data Structures

Under Linux, we can create a virtual disk containing a simple EXT2 file system as follows.

  dd if=/dev/zero of=mydisk bs=1024 count=1440
  mke2fs -b 1024 mydisk 1440

The resulting EXT2 file system has 1440 blocks, each 1KB in size. We choose 1440 blocks because that is the number of 1KB blocks on an (old) floppy disk. The resulting disk image can be used directly as a virtual (floppy) disk on most virtual machines that emulate Intel x86 based PCs, e.g. QEMU, VirtualBox and VMware. The layout of such an EXT2 file system is shown in Figure 7.4.

For ease of discussion, we shall assume this basic file system layout first. Whenever appropriate, we point out the variations, including those in large EXT2/3 FS on hard disks. The following briefly explains the contents of the disk blocks.

Block#0: Boot Block: B0 is the boot block, which is not used by the file system. It is used to contain a booter program for booting up an OS from the disk.

2. Superblock

Block#1: Superblock: (at byte offset 1024 in hard disk partitions): B1 is the superblock, which contains information about the entire file system. Some of the important fields of the superblock structure are shown below.

The meanings of most superblock fields are obvious. Only a few deserve more explanation.

s_first_data_block = 0 for 4KB block size and 1 for 1KB block size. It is used to determine the start block of group descriptors, which is s_first_data_block + 1.

s_log_block_size determines the file block size, which is 1KB*(2**s_log_block_size), e.g. 0 for 1KB block size, 1 for 2KB block size and 2 for 4KB block size. The most commonly used block size is 1KB for small file systems and 4KB for large file systems.

s_mnt_count is the number of times the file system has been mounted. When the mount count reaches s_max_mnt_count, an fsck session is forced to check the file system for consistency.

s_magic is the magic number which identifies the file system type. For EXT2/3/4 file systems, the magic number is 0xEF53.
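The field descriptions above can be tied together in a short sketch. It assumes the little-endian on-disk superblock layout of the Linux EXT2 headers, with s_first_data_block at byte offset 20, s_log_block_size at byte offset 24 and s_magic at byte offset 56. The helper names (get_le16, sb_is_ext2, sb_block_size, sb_gd_start_block) are hypothetical and not part of any library.

```c
#include <stdint.h>

#define EXT2_MAGIC 0xEF53

/* Assumed byte offsets inside the superblock (Linux EXT2 on-disk layout). */
enum { OFF_FIRST_DATA_BLOCK = 20, OFF_LOG_BLOCK_SIZE = 24, OFF_MAGIC = 56 };

/* Read little-endian integers from a raw buffer, independent of host byte order. */
static uint16_t get_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Nonzero if the buffer carries the EXT2/3/4 magic number. */
int sb_is_ext2(const unsigned char *sb)
{
    return get_le16(sb + OFF_MAGIC) == EXT2_MAGIC;
}

/* Block size in bytes: 1KB * (2 ** s_log_block_size). */
uint32_t sb_block_size(const unsigned char *sb)
{
    return 1024u << get_le32(sb + OFF_LOG_BLOCK_SIZE);
}

/* First block of the group descriptors: s_first_data_block + 1. */
uint32_t sb_gd_start_block(const unsigned char *sb)
{
    return get_le32(sb + OFF_FIRST_DATA_BLOCK) + 1;
}
```

To use it, read the 1024 bytes starting at byte offset 1024 of the disk image into a buffer and pass that buffer to these helpers. For the floppy image described earlier, one would expect a block size of 1024 and a group descriptor start block of 2.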

3. Group Descriptor

Block#2: Group Descriptor Block (s_first_data_block+1 on hard disk): EXT2 divides disk blocks into groups. Each group contains 8192 (32K on HD) blocks. Each group is described by a group descriptor structure.

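struct ext2_group_desc {
  u32 bg_block_bitmap;          // Bmap block number
  u32 bg_inode_bitmap;          // Imap block number
  u32 bg_inode_table;           // inodes begin block number
  u16 bg_free_blocks_count;
  u16 bg_free_inodes_count;
  u16 bg_used_dirs_count;
  u16 bg_pad;                   // ignore these
  u32 bg_reserved[3];
};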

Since a floppy disk (FD) has only 1440 blocks, B2 contains only 1 group descriptor. The rest are 0’s. On hard disks with a large number of groups, group descriptors may span many blocks. The most important fields in a group descriptor are bg_block_bitmap, bg_inode_bitmap and bg_inode_table, which point to the group’s block bitmap, inode bitmap and first inode block, respectively. For the Linux formatted EXT2 file system, blocks 3 to 7 are reserved. So bmap=8, imap=9 and inode_table=10.
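As a rough illustration of why one descriptor block suffices on a floppy disk but not on a large hard disk, the following sketch computes the number of block groups and the number of blocks needed to hold their descriptors. It assumes the 32-byte descriptor size of the standard Linux EXT2 layout; the function names are hypothetical.

```c
#include <stdint.h>

#define GD_SIZE 32u   /* assumed size of one group descriptor in bytes */

/* Number of block groups, rounding up so a partial last group still counts. */
uint32_t group_count(uint32_t blocks_count, uint32_t first_data_block,
                     uint32_t blocks_per_group)
{
    return (blocks_count - first_data_block + blocks_per_group - 1)
           / blocks_per_group;
}

/* Number of disk blocks needed to store ngroups descriptors. */
uint32_t gd_blocks(uint32_t ngroups, uint32_t block_size)
{
    uint32_t per_block = block_size / GD_SIZE;
    return (ngroups + per_block - 1) / per_block;
}
```

With the floppy parameters (1440 blocks, s_first_data_block = 1, 8192 blocks per group) this yields a single group whose descriptor fits in one block.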

4. Bitmaps

Block#8: Block Bitmap (Bmap): (bg_block_bitmap): A bitmap is a sequence of bits used to represent some kind of items, e.g. disk blocks or inodes. A bitmap is used to allocate and deallocate items. In a bitmap, a 0 bit means the corresponding item is FREE, and a 1 bit means the corresponding item is IN_USE. A FD has 1440 blocks but block#0 is not used by the file system. So the Bmap has only 1439 valid bits. Invalid bits are treated as IN_USE and set to 1’s.

Block#9: Inode Bitmap (Imap): (bg_inode_bitmap): An inode is a data structure used to represent a file. An EXT2 file system is created with a finite number of inodes. The status of each inode is represented by a bit in the Imap in B9. In an EXT2 FS, the first 10 inodes are reserved. So the Imap of an empty EXT2 FS starts with ten 1’s, followed by 0’s. Invalid bits are again set to 1’s.
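The bitmap operations described above can be sketched as follows. This is a minimal illustration that assumes bit i of the bitmap lives in byte i/8 at bit position i%8, counting from the least significant bit; the function names are hypothetical.

```c
#include <stdint.h>

/* Return 1 if the item is IN_USE, 0 if it is FREE. */
int tst_bit(const uint8_t *buf, int bit)
{
    return (buf[bit / 8] >> (bit % 8)) & 1;
}

/* Mark the item as IN_USE. */
void set_bit(uint8_t *buf, int bit)
{
    buf[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Mark the item as FREE. */
void clr_bit(uint8_t *buf, int bit)
{
    buf[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

/* Allocate the first FREE item among nbits: set its bit and return its
 * position (counted from 0), or -1 if every item is IN_USE. */
int alloc_bit(uint8_t *buf, int nbits)
{
    for (int i = 0; i < nbits; i++) {
        if (!tst_bit(buf, i)) {
            set_bit(buf, i);
            return i;
        }
    }
    return -1;
}
```

On an Imap whose first ten bits are 1’s, alloc_bit returns position 10. Deallocation is simply clr_bit on the same position.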

5. Inodes

Block#10: Inodes (begin) Block: (bg_inode_table): Every file is represented by a unique inode structure of 128 (256 in EXT4) bytes. The essential inode fields are listed below.

In the inode structure, i_mode is a u16 or 2-byte unsigned integer.

          | 4  | 3 |    9    |
i_mode =  |tttt|ugs|rwxrwxrwx|

In the i_mode field, the leading 4 bits specify the file type. For example, tttt= 1000 for REG file, 0100 for DIR, etc. The next 3 bits ugs indicate the file’s special usage. The last 9 bits are the rwx permission bits for file protection.
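The bit layout of i_mode can be decoded with a few masks and shifts. The sketch below uses only the two type codes given in the text (1000 for REG, 0100 for DIR); the macro and function names are hypothetical.

```c
#include <stdint.h>

#define EXT2_TYPE_MASK 0xF000   /* leading 4 bits: file type */
#define EXT2_TYPE_REG  0x8000   /* tttt = 1000 */
#define EXT2_TYPE_DIR  0x4000   /* tttt = 0100 */

int mode_is_reg(uint16_t mode) { return (mode & EXT2_TYPE_MASK) == EXT2_TYPE_REG; }
int mode_is_dir(uint16_t mode) { return (mode & EXT2_TYPE_MASK) == EXT2_TYPE_DIR; }

/* The 3 special-usage bits (ugs), returned as a value 0..7. */
unsigned mode_special(uint16_t mode) { return (mode >> 9) & 0x7u; }

/* The 9 rwx permission bits, returned as a value 0..0777. */
unsigned mode_perm(uint16_t mode) { return mode & 0x1FFu; }

/* Write the permission bits as a string such as "rwxr-xr-x".
 * out must have room for 10 chars including the terminating NUL. */
void mode_perm_string(uint16_t mode, char out[10])
{
    static const char letters[] = "rwxrwxrwx";
    for (int i = 0; i < 9; i++)
        out[i] = (mode & (1u << (8 - i))) ? letters[i] : '-';
    out[9] = '\0';
}
```

For example, an i_mode of 0x41ED decodes as a directory with permission bits 0755.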

The i_size field is the file size in bytes. The various time fields are the number of seconds elapsed since 0 hour, 0 minute, 0 second of January 1, 1970. So each time field is a very large unsigned integer. They can be converted to calendar form by the library function

char *ctime(&time_field)

which takes a pointer to a time field and returns a string in calendar form. For example,

printf("%s", ctime(&inode.i_atime));      // note: pass & of time field

prints i_atime in calendar form.
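One caveat worth noting: ctime() expects a pointer to a time_t, and on platforms where time_t is wider than 32 bits, passing the address of a 32-bit field directly would make ctime() read beyond that field. A cautious sketch copies the field into a time_t first. The wrapper name ext2_time_string is hypothetical.

```c
#include <stdint.h>
#include <time.h>

/* Convert a 32-bit EXT2 time field to a calendar string.
 * The value is widened into a time_t before its address is taken.
 * Like ctime(), this returns a pointer to a static buffer. */
const char *ext2_time_string(uint32_t time_field)
{
    time_t t = (time_t)time_field;
    return ctime(&t);
}
```

The call printf("%s", ext2_time_string(inode.i_atime)) then prints the access time in the local time zone.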

The i_block[15] array contains pointers to disk blocks of a file, which are

Direct blocks:                  i_block[0] to i_block[11], which point to direct disk blocks.

Indirect blocks:               i_block[12] points to a disk block, which contains 256 (for 1KB BLKSIZE) block numbers, each of which points to a disk block.

Double Indirect blocks:  i_block[13] points to a block, which points to 256 blocks, each of which points to 256 disk blocks.

Triple Indirect blocks:  i_block[14] is the triple-indirect block. We may ignore this for “small” EXT2 file systems.
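To make the addressing scheme concrete, the following sketch classifies a logical block number of a file (counted from 0) as direct, indirect, double indirect or triple indirect, and computes the index to follow at each level. The parameter nptr is the number of block numbers per block (256 for 1KB blocks). The type and function names are hypothetical, and the sketch only computes indices; it does not read the disk.

```c
#include <stdint.h>

enum blk_level { LVL_DIRECT, LVL_INDIRECT, LVL_DOUBLE, LVL_TRIPLE };

struct blk_path {
    enum blk_level level;  /* which i_block[] entry class to start from */
    uint32_t idx[3];       /* index to use at each successive level */
};

struct blk_path map_lbk(uint32_t lbk, uint32_t nptr)
{
    struct blk_path p = { LVL_DIRECT, {0, 0, 0} };

    if (lbk < 12) {                    /* i_block[0] to i_block[11] */
        p.idx[0] = lbk;
        return p;
    }
    lbk -= 12;
    if (lbk < nptr) {                  /* via i_block[12] */
        p.level = LVL_INDIRECT;
        p.idx[0] = lbk;
        return p;
    }
    lbk -= nptr;
    if (lbk < nptr * nptr) {           /* via i_block[13] */
        p.level = LVL_DOUBLE;
        p.idx[0] = lbk / nptr;
        p.idx[1] = lbk % nptr;
        return p;
    }
    lbk -= nptr * nptr;                /* via i_block[14] */
    p.level = LVL_TRIPLE;
    p.idx[0] = lbk / (nptr * nptr);
    p.idx[1] = (lbk / nptr) % nptr;
    p.idx[2] = lbk % nptr;
    return p;
}
```

With 1KB blocks, logical blocks 0 to 11 are direct, 12 to 267 go through the indirect block, and 268 to 65803 go through the double indirect block.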

The inode size (128 or 256) is designed to divide the block size (1KB or 4KB) evenly, so that every inode block contains an integral number of inodes. In the simple EXT2 file system, the number of inodes is (a Linux default) 184. With 8 inodes per 1KB block, the number of inode blocks is 184/8=23. So the inode blocks include B10 to B32. Each inode has a unique inode number, which is the inode’s position in the inode blocks plus 1. Note that inode positions count from 0, but inode numbers count from 1. A 0 inode number means no inode. The root directory’s inode number is 2. Similarly, disk block numbers also count from 1 since block 0 is never used by a file system. A zero block number means no disk block.
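The relationship between an inode number and its location on disk can be sketched as below. This assumes a single block group, as in the floppy disk layout; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct inode_loc {
    uint32_t block;   /* disk block that contains the inode */
    uint32_t offset;  /* byte offset of the inode within that block */
};

/* Locate inode number ino (counted from 1) given the first inode block
 * (bg_inode_table), the block size and the inode size in bytes. */
struct inode_loc locate_inode(uint32_t ino, uint32_t inode_table,
                              uint32_t block_size, uint32_t inode_size)
{
    struct inode_loc loc;
    uint32_t per_block = block_size / inode_size;  /* inodes per block */
    uint32_t pos;

    assert(ino >= 1);          /* inode number 0 means "no inode" */
    pos = ino - 1;             /* positions count from 0 */
    loc.block = inode_table + pos / per_block;
    loc.offset = (pos % per_block) * inode_size;
    return loc;
}
```

For the root directory (inode number 2) with inode_table = 10, 1KB blocks and 128-byte inodes, this gives block 10 at byte offset 128. Inode 184 falls in block 32, which agrees with the range B10 to B32 above.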

Data Blocks: Immediately after the inodes blocks are data blocks for file storage. Assuming 184 inodes, the first real data block is B33, which is i_block[0] of the root directory /.

6. Directory Entries

EXT2 Directory Entries: A directory contains dir_entry structures, each of which has the following form.

struct ext2_dir_entry_2 {
  u32 inode;                // inode number; count from 1, NOT 0
  u16 rec_len;              // this entry’s length in bytes
  u8 name_len;              // name length in bytes
  u8 file_type;             // not used
  char name[EXT2_NAME_LEN]; // name: 1-255 chars, no ending NULL
};

The dir_entry is an open-ended structure. The name field contains 1 to 255 chars without a terminating NULL byte. So the dir_entry’s rec_len also varies.
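Because entries vary in length, a directory block is traversed by stepping from one entry to the next using rec_len, and names are compared by length rather than as NULL-terminated strings. The sketch below searches one directory data block for a name. It reads the header fields byte by byte in little-endian order at offsets 0 (inode), 4 (rec_len) and 6 (name_len), with the name starting at offset 8. The function name dir_search is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Search one directory block for name.
 * Returns the entry's inode number, or 0 if the name is not found. */
uint32_t dir_search(const unsigned char *blk, uint32_t block_size,
                    const char *name)
{
    size_t want = strlen(name);
    uint32_t pos = 0;

    while (pos + 8 <= block_size) {
        const unsigned char *e = blk + pos;
        uint32_t ino = (uint32_t)e[0] | ((uint32_t)e[1] << 8) |
                       ((uint32_t)e[2] << 16) | ((uint32_t)e[3] << 24);
        uint32_t rec_len = (uint32_t)e[4] | ((uint32_t)e[5] << 8);
        uint32_t name_len = e[6];

        if (rec_len < 8)                  /* malformed entry: stop, do not loop forever */
            break;
        if (pos + 8 + name_len > block_size)
            break;                        /* name would run past the block */
        if (ino != 0 && name_len == want &&
            memcmp(e + 8, name, want) == 0)
            return ino;                   /* names carry no terminating NULL */
        pos += rec_len;                   /* step to the next entry */
    }
    return 0;
}
```

A full directory lookup would apply this to each data block of the directory in turn; here only a single block is handled.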

Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.
