2 releases
Uses new Rust 2024
| 0.1.1 | Oct 24, 2025 |
|---|---|
| 0.1.0 | Oct 24, 2025 |
#223 in Programming languages
40KB
804 lines
Umjunsik Lang - Lamina
A compiler for Umjunsik Language (엄랭) that targets Lamina IR.
Overview
This project compiles Umjunsik Language to Lamina Intermediate Representation. Lamina is a high-performance compiler backend supporting x86_64 and AArch64 architectures.
Building
cargo build --release
Usage
cargo run -- <file.umm>
Or use the compiled binary:
./target/release/umjunsik <file.umm>
Examples
Hello World (Print number 3)
examples/hello.umm:
어떻게
식...!
이 사람이름이냐ㅋㅋ
Multiplication (2 * 2 = 4)
examples/multiply.umm:
어떻게
식.. ..!
이 사람이름이냐ㅋㅋ
Variables
examples/variable.umm:
어떻게
엄...
엄어어....
식어어!
이 사람이름이냐ㅋㅋ
Explanation:
엄...: Assign 3 to variable 1엄어어....: Assign 4 to variable 2식어어!: Print variable 2 (outputs 4)
Character Output
examples/printchar.umm:
어떻게
식........... .......ㅋ
식ㅋ
이 사람이름이냐ㅋㅋ
Explanation:
식........... .......ㅋ: Print ASCII character 18식ㅋ: Print newline
Conditional
examples/conditional.umm:
어떻게
엄...
동탄어?식.....!
식..!
이 사람이름이냐ㅋㅋ
Explanation:
엄...: Assign 3 to variable 1동탄어?식.....!: If variable 1 is not 0, print 5식..!: Print 2
Implementation Details
Lexer
The lexer handles Korean characters and special tokens:
- Recognizes keywords:
어떻게,엄,어,준,식,동탄,화이팅,이 사람이름이냐ㅋㅋ - Handles repeated
어for variable indexing (e.g.,어어어= variable 3) - Processes dots (
.) and commas (,) as number literals - Supports one-line programs with
~separator
Parser
Builds an Abstract Syntax Tree (AST) with:
- Expression nodes: Number, Variable, Add, Sub, Mul
- Statement nodes: Assign, Input, Print, Conditional, Goto, Return
- Handles operator precedence (multiplication before addition)
Code Generator
Generates optimized Lamina IR:
- Two-pass compilation: First pass analyzes which variables are used, second pass generates code
- Lazy variable allocation: Only allocates stack space for variables that are actually used in the program
- SSA form: Uses Static Single Assignment with proper load/store operations
- Memory management: Variables stored on stack using
alloc.ptr.stack, initialized to 0 - Control flow: Generates proper basic blocks for conditionals and goto statements
Example output for a simple variable program:
fn @main() -> i64 {
entry:
%var_ptr_0 = alloc.ptr.stack i64 # Only allocate used variables
store.i64 %var_ptr_0, 0
line_1:
%t0 = add.i64 3, 0
store.i64 %var_ptr_0, %t0
%t1 = load.i64 %var_ptr_0
print %t1
ret.i64 0
}
Implementation Status
Implemented
- ✅ Lexer/Tokenizer with Korean character support
- ✅ Parser (AST generation)
- ✅ Optimized Lamina IR code generation
- ✅ Basic arithmetic operations (add, subtract, multiply)
- ✅ Smart variable management (lazy allocation, only used variables)
- ✅ Console output (numbers and characters)
- ✅ Newline output
- ✅ Conditionals (동탄)
- ✅ GOTO (준)
- ✅ Program exit (화이팅!)
Not Implemented
- ❌ Console input (식?) - Placeholder only (requires external function call)
- ❌ Full compilation to native code (generates Lamina IR only)
Project Structure
umjunsik-lang-lamina/
├── src/
│ ├── main.rs # Main executable
│ ├── lib.rs # Library root
│ ├── token.rs # Token definitions
│ ├── lexer.rs # Lexer (tokenization)
│ ├── ast.rs # Abstract syntax tree
│ ├── parser.rs # Parser
│ └── codegen.rs # Lamina IR code generator
├── examples/ # Example programs
├── Cargo.toml
└── README.md
License
Apache License 2.0
References
- Lamina - High-performance compiler backend
Dependencies
~4MB
~39K SLoC