Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C

The e File System

0.00/5 (No votes)
5 Sep 2013GPL35 min read 16.3K   107  
It can be used for archiving files as all reads and writes occur on one actual file in the underlying system. Additionally, it can be used to stream data out in a client-server environment. Further, it can be used as a file system for a hobbyist operating system or embedded operating system.

Introduction

The e file system which is named after the Euler constant e = 2.718 is a basic file system that aims to support a UNIX like interface. The system is currently designed to run in user space where all data is stored in a single file despite it representing multiple files and directories. The system currently supports most UNIX system calls such as open, close, read, write, and lseek. It can be used for archiving files as all reads and writes occur on one actual file in the underlying system. Additionally, it can be used to stream data out in a client-server environment. Further, it can be used as a file system for a hobbyist operating system or embedded operating system.

Background

The basic design of the e File System is the focus on a single block of 512 bytes. While this can be configured to be a larger constant, all reads and writes are based on this block size which in the code is represented by the C macro DEV_BLOCK_SIZE. The system sets aside approximately 20 percent of the blocks for inodes and the remaining 80 percent for data. The initial space in the available file system is currently reserved for loading a kernel and represents the first 0 to 4095 blocks, or two megabytes. The first block is block 4096 which is the root block describing the file system. Its a C structure name master_inode whose definition is :=

C++
typedef struct master_inode { 
   short magic;           /* Magic to identify the file system. */ 
   short init_fs;         /* 1 if fs is initialized, 0 otherwise. */ 
   block_t block_start;       /* location of master inode. */ 
   block_t inode_map_start;   /* map of free inodes start. */ 
   block_t inode_map_end;     /* map of free inodes end. */ 
   block_t bmap_map_start;    /* map of free blocks start. */ 
   block_t bmap_map_end;      /* map of free blocks end. */ 
   block_t inode_start;       /* start of inodes. */ 
   block_t inode_end;         /* end of inodes. */ 
   block_t data_start;        /* start of data. */ 
   block_t data_end;          /* end of data. */ 
   block_t imap_ptr;          /* what inode map block on disk. */ 
   block_t imap_bit;          /* what inode map bit. */ 
   block_t bmap_ptr;          /* what bit map block on disk. */ 
   block_t bmap_bit;          /* what bit map bit. */ 
   block_t inodes;            /* number of inodes. */ 
   block_t blocks;            /* number of blocks. */ 
   char *imap;            /* Created on the fly and loaded. */ 
   char *bmap;            /* Created on the fly and loaded. */ 
   block_t dev;      /* Device we are on. */ 
   char pad[MNODE_PAD];   /* padding to fit page size. */ 
} master_inode_t;

The master_inode also known as a root inode and describes the layout of the file system on disk. It indicates where the inodes start, where the data nodes start, and where the bitmaps start that mark which blocks are free or in use. The next and most important structure is the inode structure which is a C structure defined as follows: 

C++
typedef struct inode {
   char user_read;
   char user_write;
   char user_execute;

   char group_read;
   char group_write;
   char group_execute;
   
   char world_read;
   char world_write;
   char world_execute;
   
   char is_directory; /* If this is a directory, 
                       * all blocks point to other inodes,
                       including next. */
   char is_device;    /* If this is a device, call device
                      driver subsystem. */
   char is_file;      /* If this is a file, all blocks point
                      to data, including next. */
   char is_symlink;   /* If this is a symlink.  */
   char is_hardlink;  /* If this is a hardlink. */

   block_t  create_time;   /* Time of last status change. */
   block_t  modified_time; /* Time of last modification.  */
   block_t  accessed_time; /* Time of last access. */

   block_t  self;
   block_t  parent;

   /* The following are used for implementing files:
    * size      (sizeof file)
    * pos       (position in the file)
    * current   (current block number of block we are in)
    * dev       (device we are on)
    */ 
   inode_ptr_t  size; /* Size of the file in 64 bits. */

   inode_ptr_t  pos;  /* Where in the file we are currently.*/

   // NOTE - These fields must be initialized on open of the file.
   block_t      current; /* Current data block. */
   block_t      current_parent; /* Block map to which current is stored in. */
   int          iblock; /* Where in the current_parent we are. */
   int              dev; /* Device we are on. */
   inode_perm_t  o_mode; /* File open mode. */
   // The above fields are needed for implementing files.

   inode_group_t group; /* Group where we belong. */
   inode_own_t   owner; /* Who owns this file. */

   block_t refcount;    /* Reference count. */

   char path[MAX_PATH];
   char pad[INODE_PAD];
   block_t next; /* Map to blocks in case of a regular file.*/
} inode_t;

The inode structure like the master_inode are 512 bytes each. The inode describes a file or directory entry point. If it is a file, the next pointer points to data blocks and if it is a directory, the next pointer points to other inodes. In both cases, the next pointer is pointing to a C structure called block_map which contains 127 block pointers and one next pointer. The block pointers can be data or inodes. All blocks are fixed size 512 bytes. The block_map structure is defined as: 

C++
typedef struct block_map {
   block_t next;
   block_t blocks[BMAP_BLOCKS];
} block_map_t;

Symbolic links are implemented as a single 512 byte block which is pointed to by the next pointer in the inode and containing a path of another file or directory. Hardlinks are implemented similarly but have a reference count associated with them. The structure for a link is defined as: 

C++
typedef struct link {
  char path[MAX_PATH];
  char pad[MAX_PATH];
} link_t;

The most important functions in the implementation are inode_create, inode_get, and inode_free. These functions along with other file system functions make use of the block buffer cache. The inode_create function creates files, directories, symbolic links, and hard links. The inode_get function retrieves an inode representing either a file, directory, symbolic link, or hard link from the file system. The inode_free function releases the resources of a file, directory, symbolic link, or hard link. In the case of directories, the inode_free function will fail if the directory is not empty.

Paths in the system are represented using UNIX style paths where the path separator is the “/” forward slash character. The root of the file system is represented as a single forward slash. The current directory is represented by the special character “.”. The parent directory is represented by the special characters “..”.

The implementation was done using a layered approach where the most basic components were implemented first. All components were tested using test drivers written in C and are a part of their respective .c implementation files. The most low-level and basic modules are bitmap.c for implementing bitmaps, paths.c for UNIX path manipulation, and dev.c for reading from disk or writing out to disk a specified block. The next level is the block buffer cache which is contained in block.c. The heart of the system is then inode.c which implements all inode functionality. The file system system calls that refer to a file are in file.c. These include familiar open, read, write, close, and lseek calls. Directory system calls are in dir.c. Link system calls are in link.c. A utility module called krealpath.c provides a function to resolve relative paths to absolute ones as all functions actually work on absolute paths. In order for the system to work in user space, some kernel-like structures are needed and these are contained in compat.h and compat.c.

Using the Code

The code can not compile and run in user space without prefixing the system calls. Without a prefix, it is likely that the local implementation of the system calls will actually resolve the underlying kernel's. Therefore, the letter “k” is prefixed to all calls and the external interface is accessible in C programs using fs_syscalls.h. For example, if you wish to open a file, then you would use the kopen call instead of the open system call. Before actually using the file system, you need to initialize it. The code to do this is as follows: 

C++
   // ...
    #include "bool.h" 
    #include "paths.h"
    #include "block.h" 
    #include "dev.h" 
    #include "inode.h" 

    #ifdef _TEST_INC 
    #include <stdio.h> 
    #include <string.h> 
    #define printk printf 
    #include "compat.h" 
    #endif 

    #include "file.h" 
    #include "dir.h" 
    #include "link.h" 
 
    // Obtains the device based on block_open -> dev_open 
    // Initialization here is similar to what would happen 
    // in the real kernel. We must make the file system first. 
    if(inode_dev_open("./inode.dat",&dev) == INODE_INIT) { 
       if(inode_mkfs("./inode.dat", 2 * TWOMEG) != INODE_OK) { 
            printk("fs_init:: error initializing filesystem\n"); 
            return 1; 
       } 
       if(inode_dev_open("./inode.dat", &dev) != INODE_OK) { 
            printk("fs_init:: error opening device\n"); 
            return 1; 
       } 
    } 
    master = master_get(dev); 
    master_set_dev("./inode.dat", dev);

The code first checks if the file system can be opened in the call to inode_dev_open, if not, then it makes the file system using the call inode_mkfs. Once the file system is created, a second call to inode_dev_open is needed to open the file system. Afterwords, the master inode needs to be initialized and the calls to master_get and master_set_dev do that. After these calls, the file system can be used. For example: 

C++
if((fd = kopen("/foo", O_CREAT | O_RDWR)) == -1) {
        printk("error opening file\n");
}
// Ok to read, write fd file descriptor.

Points of Interest

The system is part of hobbyist operating system that is being developed. Users interested in working on an open source kernel intended for embedded and experimental use are encouraged to contact the author. As it stands, the file system was tested using test drivers that are part of the respective C source files and test a specific module. The accompanying Makefile builds various test programs to test out the code. Since no user has used the code, the code should be considered as a beta version. Users are encouraged to use the code and improve on it.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)