System Calls for File Operations in Unix/Linux: The stat System Call

The syscalls stat, lstat and fstat return information about a file. The command man 2 stat displays the man page of the stat system call, which is shown below for discussion.

1. Stat File Status

1.1. STAT(2) Linux Programmer’s Manual STAT(2)

NAME

stat, fstat, lstat – get file status

SYNOPSIS

#include <sys/types.h>

#include <sys/stat.h>

#include <unistd.h>

int stat(const char *file_name, struct stat *buf);

int fstat(int filedes, struct stat *buf);

int lstat(const char *file_name, struct stat *buf);

DESCRIPTION

These functions return information about the specified file. You do not need any access rights to the file to get this information but you need search rights to all directories named in the path leading to the file.

stat stats the file pointed to by file_name and fills in buf with stat information.

lstat is identical to stat, except in the case of a symbolic link, where the link itself is stat-ed, not the file that it refers to. So the difference between stat and lstat is: stat follows symbolic links but lstat does not.

fstat is identical to stat, except that the open file referred to by filedes (as returned by open(2)) is stat-ed in place of file_name.
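To make the difference between the three calls concrete, here is a minimal sketch. The helper names is_symlink() and same_file() are our own, not part of any library: the first uses lstat so that the link itself is examined, the second compares the result of stat on a pathname (which follows links) with the result of fstat on an open descriptor.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if path itself is a symbolic link,
   0 if it is not, -1 on error. lstat() does not follow the link. */
int is_symlink(const char *path)
{
    struct stat sb;
    if (lstat(path, &sb) < 0)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/* Hypothetical helper: returns 1 if path (after following any symlinks)
   names the same file as other, 0 if not, -1 on error.
   stat() works on a pathname, fstat() on an open file descriptor. */
int same_file(const char *path, const char *other)
{
    struct stat a, b;
    int fd, r;

    fd = open(other, O_RDONLY);
    if (fd < 0)
        return -1;
    r = fstat(fd, &b);          /* status of the open file */
    close(fd);
    if (r < 0 || stat(path, &a) < 0)
        return -1;
    /* a file is identified by its (device, inode number) pair */
    return (a.st_dev == b.st_dev && a.st_ino == b.st_ino) ? 1 : 0;
}
```

For a symlink L pointing to a regular file F, is_symlink(L) would report 1 while same_file(L, F) would also report 1, since stat follows L to F.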

2. The stat Structure

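All the stat syscalls return information in a stat structure, which contains the following fields:

struct stat {
    dev_t     st_dev;     /* device containing the file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* file type and permissions */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* user ID of owner */
    gid_t     st_gid;     /* group ID of owner */
    dev_t     st_rdev;    /* device ID (if special file) */
    off_t     st_size;    /* total size, in bytes */
    blksize_t st_blksize; /* block size for file system I/O */
    blkcnt_t  st_blocks;  /* number of 512-byte blocks allocated */
    time_t    st_atime;   /* time of last access */
    time_t    st_mtime;   /* time of last modification */
    time_t    st_ctime;   /* time of last status change */
};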

The st_size field is the size of the file in bytes. The size of a symlink is the length of the pathname it contains, without a trailing null byte.

The value st_blocks gives the size of the file in 512-byte blocks. (This may be smaller than st_size/512, e.g. when the file has holes.) The value st_blksize gives the “preferred” block size for efficient file system I/O. (Writing to a file in smaller chunks may cause an inefficient read-modify-rewrite.)
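The relation between st_size and st_blocks can be observed with a file that contains a hole. The following is a sketch only: make_sparse() and print_sizes() are hypothetical helpers, and whether the hole really stays unallocated depends on the file system in use.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates (or truncates) path, seeks hole bytes past
   the start, writes one byte, and stores the file status in *sb.
   Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t hole, struct stat *sb)
{
    int r;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    if (lseek(fd, hole, SEEK_SET) == (off_t)-1 || write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }
    r = fstat(fd, sb);
    close(fd);
    return r;
}

/* Prints the three size-related fields of a stat structure. */
void print_sizes(const struct stat *sb)
{
    printf("st_size    = %lld bytes\n", (long long)sb->st_size);
    printf("st_blocks  = %lld (x 512 = %lld bytes allocated)\n",
           (long long)sb->st_blocks, (long long)sb->st_blocks * 512);
    printf("st_blksize = %ld bytes\n", (long)sb->st_blksize);
}
```

With a hole of 1 MB, st_size is 1 MB plus one byte, while on file systems that support holes the allocated size reported through st_blocks is typically far smaller.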

Not all of the Linux file systems implement all of the time fields. Some file system types allow mounting in such a way that file accesses do not cause an update of the st_atime field. (See ‘noatime’ in mount(8).)

The field st_atime is changed by file accesses, e.g. by exec(2), mknod(2), pipe(2), utime(2) and read(2) (of more than zero bytes). Other routines, like mmap(2), may or may not update st_atime.

The field st_mtime is changed by file modifications, e.g. by mknod(2), truncate(2), utime(2) and write(2) (of more than zero bytes). Moreover, st_mtime of a directory is changed by the creation or deletion of files in that directory. The st_mtime field is not changed for changes in owner, group, hard link count, or mode.

The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.).
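The different behavior of st_mtime and st_ctime can be checked with a small experiment. In this sketch the helper touch_then_chmod() is a name we made up for illustration: it sets the access and modification times of a file to a chosen value with utime(), then changes the mode with chmod(). Per the rules above, the mode change should leave st_mtime at the chosen value but move st_ctime to the current time.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

/* Hypothetical helper: creates path, sets its atime and mtime to when,
   changes its mode to 0644, and stores the resulting status in *sb.
   Returns 0 on success, -1 on error. */
int touch_then_chmod(const char *path, time_t when, struct stat *sb)
{
    struct utimbuf times = { .actime = when, .modtime = when };
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    if (utime(path, &times) < 0)    /* sets atime and mtime */
        return -1;
    if (chmod(path, 0644) < 0)      /* changes inode info only */
        return -1;
    return stat(path, sb);
}
```

After the call, sb->st_mtime should still equal when, whereas sb->st_ctime should be close to the time of the chmod() call.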

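The following POSIX macros are defined to check the file type, where m is the st_mode field: S_ISREG(m) (regular file), S_ISDIR(m) (directory), S_ISCHR(m) (character device), S_ISBLK(m) (block device), S_ISFIFO(m) (FIFO), S_ISLNK(m) (symbolic link) and S_ISSOCK(m) (socket).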
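As an illustration of the file type macros, the sketch below maps an st_mode value to a descriptive string. The function name file_type_name() and the returned strings are our own choices, not part of POSIX.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns a short description of the file type
   encoded in mode, using the POSIX S_ISxxx() test macros. */
const char *file_type_name(mode_t mode)
{
    if (S_ISREG(mode))  return "regular file";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISLNK(mode))  return "symbolic link";
    if (S_ISCHR(mode))  return "character device";
    if (S_ISBLK(mode))  return "block device";
    if (S_ISFIFO(mode)) return "FIFO";
    if (S_ISSOCK(mode)) return "socket";
    return "unknown";
}
```

A typical use is to call lstat() on a pathname and pass the resulting st_mode to the helper; S_ISLNK can only be true when the status came from lstat, since stat follows links.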

RETURN VALUE: On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

SEE ALSO chmod(2), chown(2), readlink(2), utime(2)

3. Stat and File Inode

First, we clarify how stat works. Every file has a unique inode data structure, which contains all the information about the file. The following shows the inode structure of EXT2 file systems in Linux.

struct ext2_inode{
    u16 i_mode;
    u16 i_uid;
    u32 i_size;
    u32 i_atime;
    u32 i_ctime;
    u32 i_mtime;
    u32 i_dtime;
    u16 i_gid;
    u16 i_links_count;
    u32 i_blocks;
    u32 i_flags;
    u32 i_reserved1;
    u32 i_block[15];
    u32 pad[7];
}; // inode=128 bytes in ext2/3 FS; 256 bytes in ext4

Each inode has a unique inode number (ino) on a storage device. Each device is identified by a pair of (major, minor) device numbers, e.g. 0x0302 means /dev/hda2, 0x0803 means /dev/sda3, etc. The stat syscall simply finds the file’s inode and copies information from the inode into the stat structure, except st_dev and st_ino, which are the file’s device and inode numbers, respectively. In Unix/Linux, all time fields are the number of seconds elapsed since 00:00:00 UTC on January 1, 1970. They can be converted to calendar form by the library function ctime(&time).
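The two conversions just mentioned can be sketched as follows. The helpers legacy_major(), legacy_minor() and time_to_text() are names invented for this example. The decoding assumes the traditional 16-bit device number layout used in the examples above (high byte major, low byte minor); current Linux systems use a wider dev_t, for which glibc provides major() and minor() in <sys/sysmacros.h>.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helpers for the traditional 16-bit device number:
   high byte = major number, low byte = minor number. */
unsigned legacy_major(unsigned dev) { return (dev >> 8) & 0xFF; }
unsigned legacy_minor(unsigned dev) { return dev & 0xFF; }

/* Hypothetical helper: converts t (seconds since the epoch) to calendar
   form in the local time zone, without the trailing newline that ctime()
   appends. Returns buf, or NULL on failure. */
char *time_to_text(time_t t, char *buf, size_t size)
{
    const char *s = ctime(&t);      /* points to a static buffer */
    if (s == NULL || size == 0)
        return NULL;
    strncpy(buf, s, size - 1);
    buf[size - 1] = '\0';
    buf[strcspn(buf, "\n")] = '\0'; /* cut at the newline, if any */
    return buf;
}
```

For example, legacy_major(0x0803) and legacy_minor(0x0803) yield 8 and 3, matching /dev/sda3, and time_to_text(sb.st_mtime, buf, sizeof(buf)) gives a printable modification time for a stat structure sb.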

4. File Type and Permissions

In the stat structure, most fields are self-explanatory. Only the st_mode field needs some explanation:

mode_t st_mode; /* copied from i_mode of INODE */

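The type of st_mode is u16 (16 bits). The 16 bits have the following meaning:

The leading 4 bits are the file type, which can be interpreted as (in octal)

S_IFMT      0170000   bit mask for the file type field
S_IFSOCK    0140000   socket
S_IFLNK     0120000   symbolic link
S_IFREG     0100000   regular file
S_IFBLK     0060000   block device
S_IFDIR     0040000   directory
S_IFCHR     0020000   character device
S_IFIFO     0010000   FIFO

As of now, the man pages of Unix-like systems still use octal numbers, a convention that dates back to the PDP-11 era of the 1970s. For convenience, we shall redefine them in hex, which is much more readable, e.g.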

S_IFDIR     0x4000   directory

S_IFREG     0x8000   regular file

S_IFLNK     0xA000   symbolic link

The next 3 bits of st_mode are flags, which indicate special usage of the file:

S_ISUID    0004000    set UID bit
S_ISGID    0002000    set GID bit
S_ISVTX    0001000    sticky bit

We shall show the meaning and usage of setuid programs later. The remaining 9 bits are permission bits for file protection. They are divided into 3 categories by the (effective) uid and gid of the process:

owner  group   other

rwx    rwx    rwx

By interpreting these bits, we may display the st_mode field as

-rwxr-xr-x    (REG file: r,x for all, w for owner only)
drwxr-xr-x    (DIR: r,x for all, w for owner only)
lrw-r--r--    (LNK file with permissions)

where the first letter (-|d|l) shows the file type and the next 9 chars are based on the permission bits. Each char is printed as r|w|x if the bit is 1 or - if the bit is 0. For directory files, the x bit determines whether access to (cd into) the directory is allowed.
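The interpretation described above can be written as a short function. This is a sketch under our own naming: mode_to_string() is a hypothetical helper that handles only the three file types discussed here and prints ? for any other type. It ignores the setuid, setgid and sticky bits.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: fills buf (at least 11 bytes) with a string such
   as "-rwxr-xr-x" derived from the type and permission bits of mode. */
void mode_to_string(mode_t mode, char buf[11])
{
    static const char rwx[] = "rwxrwxrwx";
    int i;

    switch (mode & S_IFMT) {        /* leading 4 bits: file type */
    case S_IFREG: buf[0] = '-'; break;
    case S_IFDIR: buf[0] = 'd'; break;
    case S_IFLNK: buf[0] = 'l'; break;
    default:      buf[0] = '?'; break;
    }
    /* low 9 bits: bit 8 is owner r, bit 0 is other x */
    for (i = 0; i < 9; i++)
        buf[1 + i] = (mode & (1 << (8 - i))) ? rwx[i] : '-';
    buf[10] = '\0';
}
```

Calling mode_to_string(sb.st_mode, buf) after a successful lstat() gives the first column of an ls -l style listing.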

5. Opendir-Readdir Functions

A directory is also a file. We should be able to open a directory for READ, then read and display its contents just like any other ordinary file. However, depending on the file system, the contents of a directory file may vary. As a result, the user may not be able to read and interpret the contents of directory files properly. For this reason, POSIX specifies the following interface functions to directory files.

#include <dirent.h>

DIR *opendir(const char *dirPath); // open a directory named dirPath for READ

struct dirent *readdir(DIR *dp); // return a dirent pointer

In Linux, the dirent structure is

struct dirent{
    u32  d_ino;      // inode number
    u16  d_reclen;
    char d_name[ ];
};

In the dirent structure, only the d_name and d_ino fields are specified by POSIX. The other fields are system dependent. opendir() returns a DIR pointer dirp. Each readdir(dirp) call returns a pointer to the dirent structure of the next entry in the directory. It returns a NULL pointer when there are no more entries in the directory. We illustrate their usage by an example. The following code segment prints all the file names in a directory.

#include <stdio.h>
#include <dirent.h>

struct dirent *ep;
DIR *dp = opendir("dirname");
if (dp){                                // opendir() returns NULL on failure
    while ((ep = readdir(dp)) != NULL){
        printf("name=%s ", ep->d_name);
    }
    closedir(dp);
}

The reader may consult the man 3 pages of opendir and readdir for more details.
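The same loop can be packaged as a function with error handling. The following is a minimal sketch; list_dir() is a hypothetical name, and skipping the . and .. entries is our own choice rather than something readdir() does by itself.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each entry name in dirname, one per line,
   skipping "." and "..". Returns the number of names printed, or -1 if
   the directory cannot be opened. */
int list_dir(const char *dirname)
{
    struct dirent *ep;
    int count = 0;
    DIR *dp = opendir(dirname);

    if (dp == NULL)
        return -1;
    while ((ep = readdir(dp)) != NULL) {
        if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0)
            continue;
        printf("%s\n", ep->d_name);
        count++;
    }
    closedir(dp);   /* release the directory stream */
    return count;
}
```

Note that readdir() returns only names (and inode numbers); to get sizes, times or permissions, each name must still be passed to stat or lstat.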

6. Readlink Function

Linux’s open() syscall follows symlinks. It is therefore not possible to open a symlink file and read its contents. In order to read the contents of symlink files, we must use the readlink syscall, which is

ssize_t readlink(const char *pathname, char *buf, size_t bufsize);

It copies the contents of a symlink file into buf[ ] of bufsize bytes and returns the actual number of bytes copied, or -1 on error. It does not append a terminating null byte to buf.
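Because readlink() does not terminate the result with a null byte, callers usually add one themselves. A minimal sketch, in which read_link_target() is a hypothetical wrapper of our own:

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations in strict ISO C mode */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reads the contents of the symlink path into buf
   and null-terminates it. Returns the length of the link contents, or -1
   on error (for example, when path is not a symbolic link).
   If the contents do not fit in size - 1 bytes they are silently
   truncated, so a return value of size - 1 may indicate truncation. */
ssize_t read_link_target(const char *path, char *buf, size_t size)
{
    ssize_t n;

    if (size == 0)
        return -1;
    n = readlink(path, buf, size - 1);  /* leave room for the null byte */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

After a successful call, buf can be printed directly, e.g. as the linked name in a listing of the form name -> linkname.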

7. The ls Program

The following shows a simple ls program which behaves like the ls -l command of Linux. The purpose here is not to re-invent the wheel by writing yet another ls program. Rather, it is intended to show how to use the various syscalls to display information of files under a directory. By studying the example program code, the reader should also be able to figure out how Linux’s ls command is implemented.

/************* myls.c file **********/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libgen.h>       // basename()
#include <unistd.h>       // getcwd(), readlink()
#include <sys/stat.h>
#include <time.h>
#include <sys/types.h>
#include <dirent.h>

char *t1 = "xwrxwrxwr-------";
char *t2 = "----------------";

int ls_file(char *fname)
{
    struct stat fst, *sp;
    int r, i;
    char ftime[64];
    char linkname[1024] = "";   // to be filled in by readlink()
    sp = &fst;
    if ( (r = lstat(fname, &fst)) < 0){
        printf("can't stat %s\n", fname);
        exit(1);
    }
    if ((sp->st_mode & 0xF000) == 0x8000) // if (S_ISREG())
        printf("%c",'-');
    if ((sp->st_mode & 0xF000) == 0x4000) // if (S_ISDIR())
        printf("%c",'d');
    if ((sp->st_mode & 0xF000) == 0xA000) // if (S_ISLNK())
        printf("%c",'l');
    for (i=8; i >= 0; i--){
        if (sp->st_mode & (1 << i))       // print r|w|x
            printf("%c", t1[i]);
        else
            printf("%c", t2[i]);          // or print -
    }
    printf("%4d ", (int)sp->st_nlink);    // link count
    printf("%4d ", (int)sp->st_gid);      // gid
    printf("%4d ", (int)sp->st_uid);      // uid
    printf("%8ld ", (long)sp->st_size);   // file size
    // print time
    strcpy(ftime, ctime(&sp->st_ctime));  // print time in calendar form
    ftime[strlen(ftime)-1] = 0;           // kill \n at end
    printf("%s ", ftime);
    // print name
    printf("%s", basename(fname));        // print file basename
    // print -> linkname if symbolic file
    if ((sp->st_mode & 0xF000) == 0xA000){
        // use readlink() to read linkname
        printf(" -> %s", linkname);       // print linked name
    }
    printf("\n");
    return 0;
}

int ls_dir(char *dname)
{
    // use opendir(), readdir(); then call ls_file(name)
    return 0;
}

int main(int argc, char *argv[])
{
    struct stat mystat, *sp = &mystat;
    int r;
    char *filename, path[1024], cwd[256];
    filename = "./";            // default to CWD
    if (argc > 1)
        filename = argv[1];     // if specified a filename
    if ((r = lstat(filename, sp)) < 0){
        printf("no such file %s\n", filename);
        exit(1);
    }
    strcpy(path, filename);
    if (path[0] != '/'){        // filename is relative : get CWD path
        getcwd(cwd, 256);
        strcpy(path, cwd); strcat(path, "/"); strcat(path, filename);
    }
    if (S_ISDIR(sp->st_mode))
        ls_dir(path);
    else
        ls_file(path);
    return 0;
}

Exercise 1: Fill in the missing code in the above example program, i.e. the body of ls_dir() and the readlink() call in ls_file(), to make it work for any directory.

Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.
