Whitee is a tiny compiler written in C++17 that translates the SysY language into ARM-v7a assembly.
Whitee is the final work of team Good Morning! Whitegivers, which entered the 2021 Huawei Bisheng Cup Compilation System Design Competition on a wild card, ranked 5th in the final, and won the 2nd prize.
- SysY language: a subset of the C language, whose source code is usually stored in a file with the extension sy. Only one main function is allowed in a SysY program, which may also contain global variable declarations, constant declarations, other function definitions, etc. For more details see SysY-Language-Definition.pdf.
- SysY runtime library: the library provides a series of I/O functions and timing functions. For more details see SysY-Runtime-Library.pdf.
- target platform: Raspberry Pi 4B with Raspberry Pi OS (Raspbian GNU/Linux 10) and an ARM Cortex-A72 CPU.
- Make 4.2+
- CMake 3.13+
- GCC 9.3+ or other compilers supporting C++17
$ git clone https://github.com/Forever518/Whitee.git
$ mkdir Whitee/build/ && cd Whitee/build/
$ cmake .. && make -j 8
$ ./whitee [-S] [-o] [-h | --help] [-d | --debug <level>] [-c | --check <level>] [--set-debug-path=<path>] <target-file> <source-file> [-O <level>]
- -S: generate assembly, can be omitted.
- -o: set output file, can be omitted.
- -h, --help: show usage.
- -d <level>, --debug <level>: dump debug messages to certain files.
  - level 1: dump IR and optimized IR.
  - level 2: append AST.
  - level 3: append Lexer, each Optimization Pass and Register Allocation.
- -c <level>, --check <level>: check IR's relation.
  - level 1: check IR and final optimized IR only.
  - level 2: check IR after each optimization pass.
- --set-debug-path=<path>: set the output path for debug messages; defaults to the same directory as the target assembly file.
- <target-file>: target assembly file in ARM-v7a.
- <source-file>: source code file matching SysY grammar.
- -O <level>: set optimization level; defaults to the non-optimizing level -O0.
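The debug levels above read as cumulative: each level appends to what the previous one dumps. A minimal C++17 sketch of that reading follows. The function name `dumpsForDebugLevel` and the label strings are hypothetical and are not taken from the Whitee sources; treating levels below 1 as "dump nothing" is also an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the Whitee sources): lists what a given
// -d/--debug level dumps, assuming each level appends to the previous one.
std::vector<std::string> dumpsForDebugLevel(int level) {
    std::vector<std::string> dumps;
    if (level >= 1) {  // level 1: IR and optimized IR
        dumps.push_back("IR");
        dumps.push_back("optimized IR");
    }
    if (level >= 2) {  // level 2: append the AST
        dumps.push_back("AST");
    }
    if (level >= 3) {  // level 3: append lexer, per-pass IR, register allocation
        dumps.push_back("lexer");
        dumps.push_back("each optimization pass");
        dumps.push_back("register allocation");
    }
    return dumps;  // empty for levels below 1 (assumption)
}
```

Under these assumptions, `dumpsForDebugLevel(2)` yields the IR, the optimized IR and the AST, in that order.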
- When the debug-path is not set and the option -d or --debug is used, the compiler will dump debug messages to the same directory as the target assembly file, in whitee-debug-<target-file>. Each message is dumped to its own file:
  - IR: ir.txt
  - AST: ast.txt
  - FIR: ir_final.txt
  - optimized IR: ir_optimized.txt
  - lexer vector: lexer.txt
  - conflict graph of variables in FIR (taking id in IR as identification): ir_conflict_graph.txt
  - result of global register allocation (taking id in IR as identification): ir_register_alloc.txt
  - IR after each pass: directory optimize under debug-path.
- The option -c or --check makes the compiler check the IR's relation after the IR is successfully built or after each optimization pass, helping developers find more problems and bugs.
- The option -O sets the optimization level.
  - -O0: only constant propagation, copy propagation and temporary register allocation.
  - -O1: append dead code elimination, constant folding, assembly peephole optimization, etc.
  - -O2: append multiplication & division optimization, function inlining, etc.
  - -O3: append local array propagation, constant array globalization, etc.
- The target assembly directory or the debug path will be created if it does not exist.
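The default debug directory and the "create it if missing" behaviour described above can be sketched with `std::filesystem`. This is only an illustration, not Whitee's actual code: the helper names `defaultDebugDir` and `ensureDirectory` are hypothetical, and the sketch assumes that `<target-file>` in `whitee-debug-<target-file>` means the target's file name including its extension.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: default debug directory placed next to the target file.
// Assumption: the directory name uses the target's file name with extension.
fs::path defaultDebugDir(const fs::path& targetFile) {
    return targetFile.parent_path() /
           ("whitee-debug-" + targetFile.filename().string());
}

// Hypothetical helper: creates the directory and any missing parents.
// Returns true if the directory exists afterwards, false on error.
bool ensureDirectory(const fs::path& dir) {
    std::error_code ec;
    fs::create_directories(dir, ec);  // not an error if it already exists
    return !ec && fs::is_directory(dir, ec);
}
```

With these helpers, a target of `out/test.s` would map to `out/whitee-debug-test.s`, and the per-pass dumps would go into an `optimize` subdirectory created through `ensureDirectory`.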
- Strength Reduction
- Simple Loop Unrolling
- Constant Propagation
- Copy Propagation
- Loop Structure Simplification
- Dead Code Elimination
- Constant Folding (including Peephole Optimization)
- Conditional Function Inlining
- Local Common Subexpression Elimination
- Loop Invariant Code Motion
- Advanced Dead Code Elimination
- Basic Block Merging and Elimination
- Read-only Global Variable or Array to Constant
- Local Array Propagation
- Write-only Global Variable or Array Elimination
- Global Register Allocation based on Graph Coloring
- Constant Multiplication & Division Simplification
- Assembly Peephole Optimization
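To give a feel for two of the listed techniques, here is a minimal C++17 sketch of constant folding and of simplifying a multiplication by a constant power of two into a shift. The list above only names the passes, so this is not how Whitee implements them: the `Op` enum and the functions `foldConstant` and `shiftForMultiply` are hypothetical and work on plain integers rather than on a real IR.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical operator set for the illustration.
enum class Op { Add, Sub, Mul };

// Constant folding: when both operands are known at compile time, compute the
// result immediately. An empty optional stands for "not a known constant".
std::optional<int32_t> foldConstant(Op op, std::optional<int32_t> lhs,
                                    std::optional<int32_t> rhs) {
    if (!lhs || !rhs) return std::nullopt;
    // Compute in unsigned arithmetic so overflow wraps instead of being
    // undefined; the conversion back to int32_t wraps on common compilers.
    const uint32_t a = static_cast<uint32_t>(*lhs);
    const uint32_t b = static_cast<uint32_t>(*rhs);
    switch (op) {
        case Op::Add: return static_cast<int32_t>(a + b);
        case Op::Sub: return static_cast<int32_t>(a - b);
        case Op::Mul: return static_cast<int32_t>(a * b);
    }
    return std::nullopt;
}

// Multiplication simplification: x * c can become x << k when c == 2^k.
// Returns k, or an empty optional if c is not a positive power of two.
std::optional<int> shiftForMultiply(int32_t c) {
    if (c <= 0 || (c & (c - 1)) != 0) return std::nullopt;
    int k = 0;
    while ((c >> k) != 1) ++k;
    return k;
}
```

For example, `foldConstant(Op::Mul, 6, 7)` folds to a constant, while `shiftForMultiply(8)` reports that `x * 8` can be emitted as a left shift by 3. A real pass would apply the same checks to IR instructions and rewrite them in place.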
All 4 members of team Good Morning! Whitegivers.
GPL-3.0 License © Dihao Fan, Zhenwei Liu, Weikun Peng, Kelun Lei