tinysnake/simfs

SimFS

SimFS is a Single File Simulated File System written in C#.

  • It's designed to store a large number of small files.
  • It runs faster and allocates less memory than the host file system.
  • It stores all data in a single file.

Existing features:

  1. Dynamically allocate disk space.
  2. Support folders.
  3. Support customizable file extended attributes.
  4. Use Span<T> and Memory<T> to reduce memory usage.
  5. Transactions, with the ability to roll back changes.

Cautions:

  1. SimFS is not thread-safe. Call its APIs from a single thread only (it does not matter whether that is the main thread).
  2. Do not read a file before writing to it has finished; otherwise exceptions will be thrown.
  3. In the current version of SimFS, a transaction's changes are held in memory until it is committed or rolled back, so mind the memory usage before making a massive transaction.
  4. The maximum file size is limited by the BlockSize argument of the Filebase; read the BlockSize section for more information.
  5. The number of files in a directory is limited by BlockSize too. If you want to save a huge number of files, categorize them and divide them into subfolders to prevent "overflow".
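
For caution 5, one possible way to spread many files over subfolders is to derive a shard folder from a hash of the file name. The sketch below is only an illustration: `ShardedPaths`, its parameters, and the default of 256 shards are hypothetical choices, not part of SimFS.

```csharp
using System;
using System.Text;

public static class ShardedPaths
{
    // Hypothetical helper (not part of SimFS): maps a file name to
    // "<prefix>/<shard>/<fileName>" so files spread over `shardCount` subfolders.
    public static string For(string prefix, string fileName, int shardCount = 256)
    {
        if (shardCount <= 0) throw new ArgumentOutOfRangeException(nameof(shardCount));

        // FNV-1a over the UTF-8 bytes: stable across runs, unlike string.GetHashCode().
        uint hash = 2166136261u;
        unchecked
        {
            foreach (byte b in Encoding.UTF8.GetBytes(fileName))
            {
                hash ^= b;
                hash *= 16777619u;
            }
        }

        uint shard = hash % (uint)shardCount;
        // The shard folder name is written as lowercase hex, at least two digits.
        return $"{prefix}/{shard:x2}/{fileName}";
    }
}
```

The resulting path can then be passed to the calls listed in the Usage section, such as CreateParentDirectory and WriteAllText. The same file name always maps to the same subfolder, so no lookup table is needed to find a file again.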

Why SimFS was made

A common approach is to store a user's profile in one big file, even when the user has changed only a small fraction of the data. Re-serializing the entire profile takes time, and so does the file IO. Then I thought: as long as I split the user's save file into small enough pieces, the save data generated by each user action will be small enough to be serialized and stored in a very short time.

However, modern file systems are feature-rich, and operations other than reading and writing usually come with some performance overhead. This is why reading and writing a large number of small files is always slower than reading and writing one large file. So I came up with the idea of writing my own virtual file system to cut the IO time cost.

The data structure of SimFS combines concepts from various file systems. In terms of performance, it aims to match GameFramework's VFS. Currently, this project has reached a good state. Although there is still much room for improvement, due to time constraints, I will first optimize its stability and consider adding new features later.

Usage

using SimFS;

var blockSize = 1024;
var attributeSize = 0;
var bufferSize = 8196;

var filebase = new Filebase("/path/to/file", blockSize, attributeSize, bufferSize);
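// Alternatively, pass an already opened stream (requires "using System.IO;"):
// var filebase = new Filebase(File.Open("/path/to/file", FileMode.OpenOrCreate), blockSize, attributeSize, bufferSize);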

using var fs = filebase.OpenFile("some/file", OpenFileMode.OpenOrCreate);
// fs inherits from System.IO.Stream

filebase.WriteAllText("some/file", "abc");
filebase.WriteAllLines("some/file", new []{ "abc", "def"});
filebase.WriteAllBytes("some/file", new byte[] {0, 1, 2, 3});
filebase.AppendAllText(...);
filebase.AppendAllLines(...);
var bytes = filebase.ReadAllBytes("some/file");
var text = filebase.ReadAllText("some/file");
var lines = filebase.ReadAllLines("some/file");
byte[] attr = filebase.ReadFileAttributes("some/file");

SimFileInfo fi = filebase.GetFileInfo("some/file");

var files = filebase.GetFiles("some/directory", PathKind.Relative, topDirectoryOnly: true);
var dirs = filebase.GetFiles("some/directory", PathKind.Relative, topDirectoryOnly: true);

if(filebase.Exists("some/file", SimFSType.File)) { }
if(filebase.Exists("some/directory", SimFSType.Directory)) { }
if(filebase.Exists("some/path", SimFSType.Any)) { }

filebase.Move("some/file1", "some/file2");
filebase.Copy("some/file1", "some/file2");
filebase.Delete("some/file");

filebase.CreateDirectory("some/dir");
filebase.CreateParentDirectory("some/dir/file");

filebase.Dispose();

The Most Important Variable: BlockSize

The BlockSize parameter is crucial when creating a Filebase object, and it cannot be modified after creation. Its value affects many aspects of SimFS, as the following table shows:

BlockSize           128     256     512      1024     2048     4096
Block Group Size    128KB   512KB   2048KB   8192KB   32MB     128MB
Blocks per GB*      1024    2048    4096     8192     16384    32768
Inodes per GB*      1024    2048    4096     8192     16384    32768
File Size Limit     95KB    191KB   382KB    1785KB   3570KB   7140KB

* GB stands for BlockGroup here, not gigabyte.

SimFS restricts the value of BlockSize to be a power of 2 and between 128 and 4096 (values outside this range are almost meaningless).
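
The rule above, together with the Block Group Size and Blocks per GB rows of the table, can be checked with a few lines of C#. This is a sketch and not SimFS's own code: `BlockSizeMath` is a hypothetical helper, and the relation "blocks per group = 8 * BlockSize" is only inferred from the table (it matches a layout where a single block-sized bitmap tracks the blocks of a group), so treat it as an assumption.

```csharp
using System;

public static class BlockSizeMath
{
    // Mirrors the stated rule: a power of two between 128 and 4096.
    public static bool IsValid(int blockSize) =>
        blockSize >= 128 && blockSize <= 4096 && (blockSize & (blockSize - 1)) == 0;

    // Assumption inferred from the table: a group holds 8 * blockSize blocks,
    // i.e. one bit per block in a bitmap that is itself one block long.
    public static int BlocksPerGroup(int blockSize)
    {
        if (!IsValid(blockSize)) throw new ArgumentOutOfRangeException(nameof(blockSize));
        return blockSize * 8;
    }

    // Size of one block group in bytes: number of blocks times block size.
    public static long GroupSizeInBytes(int blockSize) =>
        (long)BlocksPerGroup(blockSize) * blockSize;
}
```

For a BlockSize of 1024 this gives 8192 blocks and 8,388,608 bytes (8192KB) per group, which agrees with the table. The File Size Limit row does not follow from this arithmetic and is best read from the table directly.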

If you are unfamiliar with the concepts of BlockGroup and Inode, you can focus on the file size limit. For general use, 1024 is the most versatile choice of BlockSize.

Customizable File Attributes

Since SimFS is a minimalist virtual file system, it does not record information such as creation time, modification time, access time, or permissions, which are common in general-purpose file systems. I believe it is still necessary to be able to obtain some key metadata before opening a file, so the Customizable File Attributes feature was designed for that purpose.

By design, the Attributes data is stored together with the Inode information, so its size is fixed and should not be set too large. The current maximum is 32B. According to the table above, when the size of Attributes is 32B and the BlockSize is 1024, each BlockGroup occupies an additional 256KB (8192 * 32B) of disk space.

To use the feature, pass the attributeSize parameter when creating a Filebase (like BlockSize, it cannot be modified afterwards):

var filebase = new Filebase("/path/to/file", attributeSize: 32);

Reading and writing Attributes:

byte[] attrBytes = filebase.GetFileAttributes("some/file");
SimFileInfo fi = filebase.GetFileInfo("some/file");
ReadOnlySpan<byte> attrBytes1 = fi.Attributes;

// We can change Attributes by opening the file:
using var fs = fi.Open();
fs.WriteAttribute(new byte[] {1, 2, 3, 4});

var buffer = new byte[32];
fs.ReadAttribute(buffer);
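
Because the attribute area is just a fixed-size byte buffer, the application decides what goes into it. As one possible illustration, the sketch below packs a modification time and a flags field into 32 bytes. The `FileAttr` class and its byte layout are hypothetical and not defined by SimFS; only `System.Buffers.Binary.BinaryPrimitives` from the .NET base library is used.

```csharp
using System;
using System.Buffers.Binary;

public static class FileAttr
{
    // Should match the attributeSize passed to the Filebase (32 in the example above).
    public const int Size = 32;

    // Hypothetical layout: bytes 0-7 hold the modification time (Unix milliseconds,
    // little-endian), bytes 8-11 hold user-defined flags, the rest stays zeroed.
    public static byte[] Pack(DateTimeOffset modified, uint flags)
    {
        var buffer = new byte[Size];
        BinaryPrimitives.WriteInt64LittleEndian(buffer.AsSpan(0, 8), modified.ToUnixTimeMilliseconds());
        BinaryPrimitives.WriteUInt32LittleEndian(buffer.AsSpan(8, 4), flags);
        return buffer;
    }

    // Reads the two fields back from an attribute buffer.
    public static (DateTimeOffset Modified, uint Flags) Unpack(ReadOnlySpan<byte> attributes)
    {
        if (attributes.Length < 12)
            throw new ArgumentException("Attribute buffer too small.", nameof(attributes));

        long ms = BinaryPrimitives.ReadInt64LittleEndian(attributes.Slice(0, 8));
        uint flags = BinaryPrimitives.ReadUInt32LittleEndian(attributes.Slice(8, 4));
        return (DateTimeOffset.FromUnixTimeMilliseconds(ms), flags);
    }
}
```

A buffer produced by Pack could be handed to fs.WriteAttribute, and the span returned by fi.Attributes could be handed to Unpack, which would let the metadata be inspected without reading the file's content.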

Performance

This is a relatively rigorous test run on a Windows machine using BenchmarkDotNet in the net8.0 environment:

// * Summary *

BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4317/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-12700, 1 CPU, 20 logical and 12 physical cores
.NET SDK 8.0.403
  [Host]     : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2 [AttachedDebugger]
  DefaultJob : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2

Method      tester            Mean            Error          StdDev         Gen0    Allocated
RenameData  HostFSTest        NA              NA             NA             NA      NA
RenameData  GameFramworkTest  1,157.3 ns      8.37 ns        7.83 ns        -       -
RenameData  SimFSTest         12,321.9 ns     112.23 ns      104.98 ns      -       -
FillData    HostFSTest        9,659,176.5 ns  273,644.56 ns  806,847.22 ns  -       36810 B
FillData    GameFramworkTest  432,373.4 ns    5,458.10 ns    5,105.51 ns    0.4883  11726 B
FillData    SimFSTest         184,853.6 ns    826.91 ns      733.03 ns      -       -
ReadData    HostFSTest        1,288,943.2 ns  8,672.89 ns    7,688.29 ns    5.8594  83203 B
ReadData    GameFramworkTest  97,419.0 ns     463.14 ns      410.56 ns      -       -
ReadData    SimFSTest         132,595.6 ns    586.29 ns      519.73 ns      -       -
DeleteData  HostFSTest        231,807.6 ns    943.01 ns      882.10 ns      2.1973  29600 B
DeleteData  GameFramworkTest  587.9 ns        3.55 ns        3.15 ns        -       -
DeleteData  SimFSTest         4,849.4 ns      42.58 ns       37.74 ns       -       -

In addition, I also conducted informal tests on different mobile devices using the Unity engine:

Environment: Unity 2022.4.33f1 - Release ScriptBackend: IL2CPP

Test Device: XiaoMi MI5

Method              tester            Mean
FillData-FirstTime  HostFSTest        92 ms
FillData-FirstTime  GameFramworkTest  304 ms
FillData-FirstTime  SimFSTest         216 ms
FillData            HostFSTest        65 ms
FillData            GameFramworkTest  26 ms
FillData            SimFSTest         7 ms
ReadData            HostFSTest        59 ms
ReadData            GameFramworkTest  24 ms
ReadData            SimFSTest         34 ms
DeleteData          HostFSTest        24 ms
DeleteData          GameFramworkTest  15 ms
DeleteData          SimFSTest         2 ms

Test Device: Google Pixel 5

Method              tester            Mean
FillData-FirstTime  HostFSTest        27 ms
FillData-FirstTime  GameFramworkTest  127 ms
FillData-FirstTime  SimFSTest         88 ms
FillData            HostFSTest        20 ms
FillData            GameFramworkTest  7 ms
FillData            SimFSTest         2 ms
ReadData            HostFSTest        17 ms
ReadData            GameFramworkTest  1 ms
ReadData            SimFSTest         2 ms
DeleteData          HostFSTest        8 ms
DeleteData          GameFramworkTest  5 ms
DeleteData          SimFSTest         1 ms

DataStructure

You can refer to the dedicated section: DataStructure

What's more

The following features still need to be improved:

  • Delayed disk space allocation.
  • Rollback operations in case of exceptions to prevent file system corruption.
  • Defragmentation.
