A distributed file system modeled on the HDFS architecture, supporting basic operations such as file read/write/delete and directory management. Metadata is stored on the NameNode; files are split into blocks and stored across multiple DataNodes with multi-replica redundancy.
Start the NameNode (choose one option):

Option A: IDE Debug

```shell
# In GoLand, set Run Kind to Package
# Program Arguments:
-conf=nameNode/conf/nameNode_config.yml
```

Option B: Run from Source

```shell
cd TrainFS/nameNode
go run nameNode_ctrl.go -conf=./conf/nameNode_config.yml
```

Option C: Build and Run

```shell
cd TrainFS/nameNode
go build -o ./build/nameNode
./build/nameNode
```

Start the DataNodes (choose one option):

Option A: IDE Debug
```shell
# Program Arguments:
-id=1 -port=9001 -conf=dataNode/conf/dataNode_config.yml
```

Option B: Run from Source

```shell
cd TrainFS/dataNode
go run dataNode_ctrl.go -id=1 -port=9001 -conf=./conf/dataNode_config.yml
go run dataNode_ctrl.go -id=2 -port=9002 -conf=./conf/dataNode_config.yml
go run dataNode_ctrl.go -id=3 -port=9003 -conf=./conf/dataNode_config.yml
```

Option C: Build and Run
```shell
cd TrainFS/dataNode
go build -o ./build/dataNode
./build/dataNode -id=1 -port=9001
./build/dataNode -id=2 -port=9002
./build/dataNode -id=3 -port=9003
```

Run the client tests:

```shell
cd TrainFS/client
go test -run TestPutFile
go test -run TestGetFile
go test -run TestDelete
```

Install the protobuf runtime and code generators (the generator binaries live under `cmd/` and are installed with `go install`), then regenerate the gRPC stubs:

```shell
go get -u google.golang.org/protobuf/proto@latest
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
cd profile
protoc --go_out=. --go-grpc_out=. ./*.proto
```

Upload a local file to the distributed file system.
```go
PutFile("/home/user/test.txt", "/app")
```

Flow:

- Client requests upload from NameNode, gets DataNode address chain
- File is split into multiple Chunks by configured size (e.g., test.txt_chunk_0, test.txt_chunk_1)
- Client sends Chunks to the first DataNode; nodes forward along the chain
- DataNode saves data and commits metadata to NameNode
- Client confirms upload completion
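The split step can be sketched as follows. The `test.txt_chunk_N` naming matches the example above; the function name and in-memory slicing are illustrative, since the actual client may stream the file instead:

```go
package main

import "fmt"

// splitIntoChunks slices file data into fixed-size chunks, naming each one
// after the source file (matching the test.txt_chunk_0 convention above).
// Illustrative sketch only; a real client would stream the file rather
// than hold it all in memory.
func splitIntoChunks(fileName string, data []byte, chunkSize int) map[string][]byte {
	chunks := make(map[string][]byte)
	for i, off := 0, 0; off < len(data); i, off = i+1, off+chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunks[fmt.Sprintf("%s_chunk_%d", fileName, i)] = data[off:end]
	}
	return chunks
}

func main() {
	chunks := splitIntoChunks("test.txt", []byte("hello world"), 4)
	fmt.Println(len(chunks)) // 11 bytes in 4-byte chunks -> 3 chunks
}
```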
Download a file from the distributed file system to local disk.

```go
GetFile("/app/test.txt", "/home/user")
```

Flow: Query NameNode for block locations, retrieve Chunks from DataNodes, and merge.
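The merge step can be sketched as below; ordering chunks by the numeric `_chunk_N` suffix follows the naming convention above, while the helper names are illustrative (the real client gets chunk order from NameNode metadata):

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// chunkIndex parses N from a name like "test.txt_chunk_N".
func chunkIndex(name string) int {
	n, _ := strconv.Atoi(name[strings.LastIndex(name, "_")+1:])
	return n
}

// mergeChunks reassembles a file by concatenating chunks in _chunk_N order.
// Illustrative sketch, not the repository's actual client code.
func mergeChunks(chunks map[string][]byte) []byte {
	names := make([]string, 0, len(chunks))
	for name := range chunks {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		return chunkIndex(names[i]) < chunkIndex(names[j])
	})
	var out []byte
	for _, name := range names {
		out = append(out, chunks[name]...)
	}
	return out
}

func main() {
	merged := mergeChunks(map[string][]byte{
		"test.txt_chunk_1": []byte("lo"),
		"test.txt_chunk_0": []byte("hel"),
	})
	fmt.Println(string(merged)) // hello
}
```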
Create a remote directory.

```go
Mkdir("/home/user")
```

Delete a file or empty directory.
```go
DeleteFile("/home/user/test.txt")
DeleteFile("/home/user") // Only supports empty directories
```

Flow: NameNode deletes metadata, then sends delete tasks to DataNodes via heartbeat responses.
List directory contents.

```go
ListDir("/home/user")
```

Rename an empty directory.
```go
ReName("/home/user", "/home/newUser")
```

The NameNode manages file system metadata, maintaining two core mappings:

- `<FilePath, [Chunk1, Chunk2, ...]>` - file-to-chunks mapping
- `<ChunkName, [DataNode1, DataNode2, ...]>` - chunk-to-replica-locations mapping
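The two mappings and a block commit can be sketched as follows. Type and method names here are illustrative, not the repository's actual API:

```go
package main

import "fmt"

// Metadata holds the NameNode's two core mappings described above.
type Metadata struct {
	FileToChunks  map[string][]string // <FilePath, [Chunk1, Chunk2, ...]>
	ChunkReplicas map[string][]string // <ChunkName, [DataNode1, DataNode2, ...]>
}

// Commit records that dataNode stores chunk, and links chunk to filePath
// if it is not already listed there.
func (m *Metadata) Commit(filePath, chunk, dataNode string) {
	found := false
	for _, c := range m.FileToChunks[filePath] {
		if c == chunk {
			found = true
			break
		}
	}
	if !found {
		m.FileToChunks[filePath] = append(m.FileToChunks[filePath], chunk)
	}
	m.ChunkReplicas[chunk] = append(m.ChunkReplicas[chunk], dataNode)
}

func main() {
	m := &Metadata{
		FileToChunks:  make(map[string][]string),
		ChunkReplicas: make(map[string][]string),
	}
	// Two replicas of the same chunk commit; the file still lists it once.
	m.Commit("/app/test.txt", "test.txt_chunk_0", "dataNode1")
	m.Commit("/app/test.txt", "test.txt_chunk_0", "dataNode2")
	fmt.Println(len(m.FileToChunks["/app/test.txt"]), len(m.ChunkReplicas["test.txt_chunk_0"])) // 1 2
}
```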
Responsibilities:
- Receive DataNode registration and heartbeat, manage node status
- Process block commits, update metadata mappings
- Detect node failures, trigger replica rebalancing tasks
- Dispatch delete/replication tasks via heartbeat responses
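The heartbeat-driven task dispatch above can be sketched as a per-DataNode queue that is drained when that node's heartbeat arrives. All names here are assumptions for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// HeartbeatResponse carries the tasks the NameNode piggybacks on a
// heartbeat reply, as described above.
type HeartbeatResponse struct {
	DeleteChunks    []string // chunks the DataNode should remove
	ReplicateChunks []string // chunks the DataNode should copy elsewhere
}

// TaskQueue accumulates per-DataNode tasks and drains them on heartbeat.
type TaskQueue struct {
	mu      sync.Mutex
	pending map[string]*HeartbeatResponse // keyed by DataNode ID
}

func NewTaskQueue() *TaskQueue {
	return &TaskQueue{pending: make(map[string]*HeartbeatResponse)}
}

func (q *TaskQueue) AddDelete(nodeID, chunk string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[nodeID] == nil {
		q.pending[nodeID] = &HeartbeatResponse{}
	}
	q.pending[nodeID].DeleteChunks = append(q.pending[nodeID].DeleteChunks, chunk)
}

// OnHeartbeat returns and clears the tasks queued for nodeID.
func (q *TaskQueue) OnHeartbeat(nodeID string) HeartbeatResponse {
	q.mu.Lock()
	defer q.mu.Unlock()
	resp := q.pending[nodeID]
	delete(q.pending, nodeID)
	if resp == nil {
		return HeartbeatResponse{}
	}
	return *resp
}

func main() {
	q := NewTaskQueue()
	q.AddDelete("dataNode1", "test.txt_chunk_0")
	fmt.Println(len(q.OnHeartbeat("dataNode1").DeleteChunks)) // 1: task delivered
	fmt.Println(len(q.OnHeartbeat("dataNode1").DeleteChunks)) // 0: queue drained
}
```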
The DataNode handles actual data block storage and transfer.
Responsibilities:
- Register with NameNode on startup, report disk space
- Report all stored Chunk information
- Send periodic heartbeats, execute dispatched tasks
- Receive and store Chunks, forward to downstream nodes in chain
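The store-and-forward step of chained replication can be sketched as below. The function is an illustrative assumption; the real node forwards the chunk to the next address over gRPC rather than returning the next hop:

```go
package main

import "fmt"

// storeAndForward persists a chunk locally, then hands back the next
// DataNode address in the chain (and the remaining chain to pass along).
// An empty chain means this node holds the last replica.
func storeAndForward(name string, data []byte, chain []string, store map[string][]byte) (next string, rest []string) {
	store[name] = data // persist locally first
	if len(chain) == 0 {
		return "", nil // end of the replication chain
	}
	return chain[0], chain[1:] // forward to the next DataNode
}

func main() {
	store := make(map[string][]byte)
	next, rest := storeAndForward("test.txt_chunk_0", []byte("hel"),
		[]string{"dataNode2:9002", "dataNode3:9003"}, store)
	fmt.Println(next, len(rest)) // dataNode2:9002 1
}
```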
Planned improvements:

- Optimize Metadata Structure: Refine lock granularity for better support of ReName operations
- NameNode High Availability: Upgrade to a sharded distributed cluster based on a consensus algorithm
- Async Commit Optimization: DataNode receives and forwards data first, then batch-commits to NameNode