TrainFS


A distributed file system modeled on the HDFS architecture. It supports basic operations such as reading, writing, and deleting files, as well as directory management. Metadata is stored in the NameNode; files are split into blocks and stored across multiple DataNodes with multi-replica redundancy.

Quick Start

1. Start NameNode

Option A: IDE Debug

# In GoLand, set Run Kind to Package
# Program Arguments:
-conf=nameNode/conf/nameNode_config.yml

Option B: Run from Source

cd TrainFS/nameNode
go run nameNode_ctrl.go -conf=./conf/nameNode_config.yml

Option C: Build and Run

cd TrainFS/nameNode
go build -o ./build/nameNode
./build/nameNode

2. Start DataNode (Multiple Nodes)

Option A: IDE Debug

# Program Arguments:
-id=1 -port=9001 -conf=dataNode/conf/dataNode_config.yml

Option B: Run from Source

cd TrainFS/dataNode
go run dataNode_ctrl.go -id=1 -port=9001 -conf=./conf/dataNode_config.yml
go run dataNode_ctrl.go -id=2 -port=9002 -conf=./conf/dataNode_config.yml
go run dataNode_ctrl.go -id=3 -port=9003 -conf=./conf/dataNode_config.yml

Option C: Build and Run

cd TrainFS/dataNode
go build -o ./build/dataNode
./build/dataNode -id=1 -port=9001
./build/dataNode -id=2 -port=9002
./build/dataNode -id=3 -port=9003

3. Client Testing

cd TrainFS/client
go test -run TestPutFile
go test -run TestGetFile
go test -run TestDelete

4. Regenerate Protobuf (Optional)

go get -u google.golang.org/protobuf/proto@latest
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

cd profile
protoc --go_out=. --go-grpc_out=. ./*.proto

System Architecture

API Reference

PutFile(localFilePath, remotePath)

Upload a local file to the distributed file system.

PutFile("/home/user/test.txt", "/app")

Flow:

1. The client requests an upload from the NameNode and receives a chain of DataNode addresses.
2. The file is split into Chunks of the configured size (e.g., test.txt_chunk_0, test.txt_chunk_1).
3. The client sends each Chunk to the first DataNode, and the nodes forward it along the chain.
4. Each DataNode saves the data and commits the metadata to the NameNode.
5. The client confirms that the upload is complete.
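
The splitting step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the `Chunk` type and the `SplitIntoChunks` function are hypothetical names, and the file is handled as an in-memory byte slice for brevity. Only the `<fileName>_chunk_<index>` naming comes from the flow above.

```go
package main

import "fmt"

// Chunk is a hypothetical in-memory representation of one file block.
type Chunk struct {
	Name string // e.g. "test.txt_chunk_0"
	Data []byte
}

// SplitIntoChunks cuts data into pieces of at most chunkSize bytes and
// names them "<fileName>_chunk_<index>".
func SplitIntoChunks(fileName string, data []byte, chunkSize int) ([]Chunk, error) {
	if chunkSize <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", chunkSize)
	}
	var chunks []Chunk
	for start, idx := 0, 0; start < len(data); start, idx = start+chunkSize, idx+1 {
		end := start + chunkSize
		if end > len(data) {
			end = len(data) // the last chunk may be shorter
		}
		chunks = append(chunks, Chunk{
			Name: fmt.Sprintf("%s_chunk_%d", fileName, idx),
			Data: data[start:end],
		})
	}
	return chunks, nil
}

func main() {
	chunks, err := SplitIntoChunks("test.txt", []byte("hello distributed world"), 8)
	if err != nil {
		panic(err)
	}
	for _, c := range chunks {
		fmt.Printf("%s: %q\n", c.Name, c.Data)
	}
}
```

A real client would more likely stream the file from disk chunk by chunk instead of loading it whole.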

GetFile(remoteFilePath, localPath)

Download a file from the distributed file system to a local directory.

GetFile("/app/test.txt", "/home/user")

Flow: The client queries the NameNode for block locations, retrieves the Chunks from the DataNodes, and merges them in order.

Mkdir(remoteDirPath)

Create a remote directory.

Mkdir("/home/user")

DeleteFile(remotePath)

Delete a file or empty directory.

DeleteFile("/home/user/test.txt")
DeleteFile("/home/user")  // Only supports empty directories

Flow: The NameNode deletes the metadata, then sends delete tasks to the DataNodes in its heartbeat responses.
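
The deferred-delete idea can be sketched like this. The `NameNode` struct, its field names, and the `Heartbeat` method are hypothetical and simplified (single-threaded, no directories, no RPC). The sketch only shows metadata being removed immediately while the chunk deletions wait for each DataNode's next heartbeat.

```go
package main

import "fmt"

// NameNode is a hypothetical, single-threaded model of the metadata center.
type NameNode struct {
	fileChunks    map[string][]string // file path -> chunk names
	chunkNodes    map[string][]string // chunk name -> DataNode IDs
	pendingDelete map[string][]string // DataNode ID -> chunks it should delete
}

// DeleteFile removes the file's metadata right away and queues one delete
// task per replica; no DataNode is contacted here.
func (n *NameNode) DeleteFile(path string) error {
	chunks, ok := n.fileChunks[path]
	if !ok {
		return fmt.Errorf("no such file: %s", path)
	}
	for _, c := range chunks {
		for _, dn := range n.chunkNodes[c] {
			n.pendingDelete[dn] = append(n.pendingDelete[dn], c)
		}
		delete(n.chunkNodes, c)
	}
	delete(n.fileChunks, path)
	return nil
}

// Heartbeat returns the delete tasks queued for a DataNode and clears them,
// modeling the heartbeat response.
func (n *NameNode) Heartbeat(dataNode string) []string {
	tasks := n.pendingDelete[dataNode]
	delete(n.pendingDelete, dataNode)
	return tasks
}

func main() {
	nn := &NameNode{
		fileChunks: map[string][]string{"/app/test.txt": {"test.txt_chunk_0", "test.txt_chunk_1"}},
		chunkNodes: map[string][]string{
			"test.txt_chunk_0": {"dn1", "dn2"},
			"test.txt_chunk_1": {"dn2", "dn3"},
		},
		pendingDelete: map[string][]string{},
	}
	if err := nn.DeleteFile("/app/test.txt"); err != nil {
		panic(err)
	}
	fmt.Println(nn.Heartbeat("dn2")) // [test.txt_chunk_0 test.txt_chunk_1]
	fmt.Println(nn.Heartbeat("dn2")) // []
}
```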

ListDir(remoteDirPath)

List directory contents.

ListDir("/home/user")

ReName(oldPath, newPath)

Rename an empty directory.

ReName("/home/user", "/home/newUser")
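
The directory operations above (Mkdir, ListDir, ReName on empty directories) can be modeled with a small in-memory namespace. This is an illustrative sketch only: the `Namespace` type and its layout are hypothetical, and requiring the parent directory to exist before `Mkdir` is an assumption, not documented TrainFS behavior.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// Namespace is a hypothetical directory tree: directory path -> child names.
type Namespace struct {
	children map[string]map[string]bool
}

func NewNamespace() *Namespace {
	return &Namespace{children: map[string]map[string]bool{"/": {}}}
}

// Mkdir creates a directory whose parent already exists (assumption).
func (ns *Namespace) Mkdir(p string) error {
	p = path.Clean(p)
	parent := path.Dir(p)
	if _, ok := ns.children[parent]; !ok {
		return fmt.Errorf("parent directory %s does not exist", parent)
	}
	if _, ok := ns.children[p]; ok {
		return fmt.Errorf("%s already exists", p)
	}
	ns.children[parent][path.Base(p)] = true
	ns.children[p] = map[string]bool{}
	return nil
}

// ListDir returns the child names of a directory in sorted order.
func (ns *Namespace) ListDir(p string) ([]string, error) {
	kids, ok := ns.children[path.Clean(p)]
	if !ok {
		return nil, fmt.Errorf("no such directory: %s", p)
	}
	names := make([]string, 0, len(kids))
	for name := range kids {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

// ReName moves a directory, but only if it is empty.
func (ns *Namespace) ReName(oldPath, newPath string) error {
	oldPath, newPath = path.Clean(oldPath), path.Clean(newPath)
	kids, ok := ns.children[oldPath]
	switch {
	case !ok:
		return fmt.Errorf("no such directory: %s", oldPath)
	case oldPath == "/":
		return fmt.Errorf("cannot rename the root directory")
	case len(kids) > 0:
		return fmt.Errorf("%s is not empty", oldPath)
	}
	if _, exists := ns.children[newPath]; exists {
		return fmt.Errorf("%s already exists", newPath)
	}
	newParent := path.Dir(newPath)
	if _, ok := ns.children[newParent]; !ok || newParent == oldPath {
		return fmt.Errorf("invalid target parent %s", newParent)
	}
	delete(ns.children[path.Dir(oldPath)], path.Base(oldPath))
	delete(ns.children, oldPath)
	ns.children[newParent][path.Base(newPath)] = true
	ns.children[newPath] = map[string]bool{}
	return nil
}

func main() {
	ns := NewNamespace()
	for _, d := range []string{"/home", "/home/user"} {
		if err := ns.Mkdir(d); err != nil {
			panic(err)
		}
	}
	if err := ns.ReName("/home/user", "/home/newUser"); err != nil {
		panic(err)
	}
	names, _ := ns.ListDir("/home")
	fmt.Println(names)                       // [newUser]
	fmt.Println(ns.ReName("/home", "/data")) // /home is not empty
}
```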

Core Components

NameNode (Metadata Center)

Manages file system metadata, maintaining two core mappings:

<FilePath, [Chunk1, Chunk2, ...]> - File to blocks mapping
<ChunkName, [DataNode1, DataNode2, ...]> - Block to replica locations mapping
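
As a rough illustration, the two mappings could be held in a structure like the one below. The `Metadata` type, its method names, and the single read-write lock are assumptions made for this sketch; the actual NameNode may persist and lock its metadata differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Metadata is a hypothetical in-memory form of the two core mappings.
type Metadata struct {
	mu         sync.RWMutex
	fileChunks map[string][]string // FilePath  -> [Chunk1, Chunk2, ...]
	chunkNodes map[string][]string // ChunkName -> [DataNode1, DataNode2, ...]
}

func NewMetadata() *Metadata {
	return &Metadata{fileChunks: map[string][]string{}, chunkNodes: map[string][]string{}}
}

// appendUnique appends v unless it is already present.
func appendUnique(list []string, v string) []string {
	for _, x := range list {
		if x == v {
			return list
		}
	}
	return append(list, v)
}

// CommitChunk records that dataNode now stores chunkName of filePath.
// Chunk order follows commit order here, which is a simplification.
func (m *Metadata) CommitChunk(filePath, chunkName, dataNode string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.fileChunks[filePath] = appendUnique(m.fileChunks[filePath], chunkName)
	m.chunkNodes[chunkName] = appendUnique(m.chunkNodes[chunkName], dataNode)
}

// ChunkLocation pairs one chunk with the DataNodes holding a replica.
type ChunkLocation struct {
	Chunk     string
	DataNodes []string
}

// Locate resolves a file to its chunks and their replica locations.
func (m *Metadata) Locate(filePath string) ([]ChunkLocation, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	chunks, ok := m.fileChunks[filePath]
	if !ok {
		return nil, false
	}
	locs := make([]ChunkLocation, 0, len(chunks))
	for _, c := range chunks {
		nodes := append([]string(nil), m.chunkNodes[c]...) // copy for the caller
		locs = append(locs, ChunkLocation{Chunk: c, DataNodes: nodes})
	}
	return locs, true
}

func main() {
	md := NewMetadata()
	md.CommitChunk("/app/test.txt", "test.txt_chunk_0", "dn1")
	md.CommitChunk("/app/test.txt", "test.txt_chunk_0", "dn2")
	md.CommitChunk("/app/test.txt", "test.txt_chunk_1", "dn2")
	locs, _ := md.Locate("/app/test.txt")
	for _, l := range locs {
		fmt.Println(l.Chunk, l.DataNodes)
	}
}
```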

Responsibilities:

Receive DataNode registration and heartbeat, manage node status
Process block commits, update metadata mappings
Detect node failures, trigger replica rebalancing tasks
Dispatch delete/replication tasks via heartbeat responses

DataNode (Data Node)

Handles actual data block storage and transfer.

Responsibilities:

Register with NameNode on startup, report disk space
Report all stored Chunk information
Send periodic heartbeats, execute dispatched tasks
Receive and store Chunks, forward to downstream nodes in chain
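
The store-then-forward behavior can be sketched in-process as follows. The `DataNode` type and `Receive` method are hypothetical, and a direct method call stands in for the network transfer between nodes. Returning the IDs of the nodes that stored the chunk is an assumption added to make the chain visible.

```go
package main

import "fmt"

// DataNode is a hypothetical in-memory model of one storage node.
type DataNode struct {
	ID     string
	chunks map[string][]byte
}

func NewDataNode(id string) *DataNode {
	return &DataNode{ID: id, chunks: map[string][]byte{}}
}

// Receive stores a private copy of the chunk, then forwards it to the next
// node in the chain. It returns the IDs of all nodes that stored the chunk.
func (d *DataNode) Receive(name string, data []byte, downstream []*DataNode) []string {
	d.chunks[name] = append([]byte(nil), data...)
	acks := []string{d.ID}
	if len(downstream) > 0 {
		acks = append(acks, downstream[0].Receive(name, data, downstream[1:])...)
	}
	return acks
}

func main() {
	dn1, dn2, dn3 := NewDataNode("dn1"), NewDataNode("dn2"), NewDataNode("dn3")
	// The client only talks to the first node; the rest is forwarded.
	acks := dn1.Receive("test.txt_chunk_0", []byte("hello"), []*DataNode{dn2, dn3})
	fmt.Println(acks) // [dn1 dn2 dn3]
}
```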

Future Plans

Optimize Metadata Structure: Refine lock granularity and improve support for ReName operations
NameNode High Availability: Upgrade to a sharded distributed cluster based on a consensus algorithm
Async Commit Optimization: DataNodes receive and forward data first, then commit to the NameNode in batches
