Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Virtual (flat) file system in C++

Essentially I need to implement a program to act as a user space file system that implements very simple operations such as viewing what is on the disk, copying files to and from the native file system to my file system ( which is contained in a single file called "disk01") and deleting files from my file system.

I'm basically looking for a springboard or some hint as to where I can start as I'm unsure how to create my own "disk" and put other files inside it and this is a homework assignment.

Just a C++ student looking for some direction.

Edit:

I know this is a concept that is already used in several different places as "VFS's" or virtual file systems, kind of like zip files (you can only view the contents through a program that can handle zip files). I'm basically trying to write my own program similar to zip or winrar or whatever, but not near as complicated and feature rich.

Thank you for your suggestions so far! You all are a tremendous help!

like image 848
Jason M. Avatar asked Nov 21 '12 04:11

Jason M.


2 Answers

It's very simple to create a FAT-like disk structure.

At a fixed location in the file, most likely first, you have a structure with some information about the disk. Then follows the "FAT", a table of simple structures detailing the files on the disk. This is basically a fixed-size table of structure similar to this:

struct FATEntry
{
    char      name[20];  /* Name of file */
    uint32_t  pos;       /* Position of file on disk (sector, block, something else) */
    uint32_t  size;      /* Size in bytes of file */
    uint32_t  mtime;     /* Time of last modification */
};

After this table you have a fixed-sized area used for a bitmap of free blocks on the disk. If the filesystem can grow or shrink dynamically then the bitmap might not be needed. This is then followed by the actual file data.

For a system like this, all files must be continuously laid out on the disk. This will lead to fragmenting as you add and remove and resize files.


Another way is to use the linked-list approach, used for example on the old Amiga filesystems. Using this scheme all blocks are simply linked lists.

Like before you need a structure for the actual disk data, and possibly a bitmap showing free/allocated disk blocks. The only field needed in the disk-data structure is a "pointer" to the first file.

By pointer I mean an integer pointing out the location on disk of a block.

The files themselves can be similar to the above FAT-like system:

struct FileNode
{
    char     name[12];  /* Name of file */
    uint32_t next;      /* Next file, zero for last file */
    uint32_t prev;      /* Previous file, zero for first file */
    uint32_t data;      /* Link to first data block */
    uint32_t mtime;     /* Last modification time */
    uint32_t size;      /* Size in bytes of the file */
};

The data blocks are themselves linked lists:

struct DataNode
{
    uint32_t next;  /* Next data block for file, zero for last block */
    char data[BLOCK_SIZE - 4];  /* Actual data, -4 for the block link */
};

The good thing about a linked-list filesystem is that it will never be fragmented. The drawbacks are that you might have to jump all over the disk to fetch data blocks, and that the data blocks can't be used in full as they need at least one link to the next data block.


A third way, common in Unix-like systems, is to have the file data contain a set of links to the data-blocks. Then the data blocks doesn't have to be stored contiguously on the disk. It will include some of the same drawbacks as the linked-list method, in that blocks could be stored all over the disk, and that the maximum size of the file is limited. One pro is that the data-blocks can be fully utilized.

Such a structure could look like

struct FileNode
{
    char name[16];      /* Name of file */
    uint32_t size;      /* Size in bytes of file */
    uint32_t mtime;     /* Last modification time of file */
    uint32_t data[26];  /* Array of data-blocks */
};

The above structure limits the maximum file-size to 26 data blocks.

like image 170
Some programmer dude Avatar answered Sep 27 '22 19:09

Some programmer dude


Open a file for non-destructive read/write. With an fstream this might be fstream stream(filename).

Then use the seek functions to move around it. If you are using a C++ fstream this is stream.seekg(position).

Then you'd want binary read and write functions so you'd use stream.read(buffer, len) and stream.write(buffer, len).

An easy way to start a filesystem is to decide on a block size. Most people used 512 bytes in the old days. You could do that or use 4K or make it completely adjustable. Then you set aside a block near the beginning for a free-space map. This can be a bit per block or if you are lazy one byte per block. Then after that you have a root directory. FAT did this an easy way: it's just a list of names, some meta data like timestamps, file size and block offsets. I think FAT blocks had a pointer to the next block in the file so that it could fragment the file without needing to run a defrag while writing.

So then you search the directory, find the file, go to the offset and read the blocks.

Where real filesystems get complicated is in hard tasks like allocating blocks for files so they have room to grow at the ends without wasting space. Handling fragmentation. Having good performance while multiple threads or programs are writing at the same time. Robust recovery in the face of unexpected disk errors or power loss.

like image 23
Zan Lynx Avatar answered Sep 27 '22 21:09

Zan Lynx