Thanks to visit codestin.com
Credit goes to lib.rs

2 releases

Uses new Rust 2024

0.1.1 Oct 24, 2025
0.1.0 Oct 24, 2025

#223 in Programming languages

Apache-2.0

40KB
804 lines

Umjunsik Lang - Lamina

A compiler for Umjunsik Language (엄랭) that targets Lamina IR.

Overview

This project compiles Umjunsik Language to Lamina Intermediate Representation. Lamina is a high-performance compiler backend supporting x86_64 and AArch64 architectures.

Building

cargo build --release

Usage

cargo run -- <file.umm>

Or use the compiled binary:

./target/release/umjunsik <file.umm>

Examples

Hello World (Print number 3)

examples/hello.umm:

어떻게
식...!
이 사람이름이냐ㅋㅋ

Multiplication (2 * 2 = 4)

examples/multiply.umm:

어떻게
식.. ..!
이 사람이름이냐ㅋㅋ

Variables

examples/variable.umm:

어떻게
엄...
엄어어....
식어어!
이 사람이름이냐ㅋㅋ

Explanation:

  1. ... : Assign 3 to variable 1
  2. 엄어어.... : Assign 4 to variable 2
  3. 식어어! : Print variable 2 (outputs 4)

Character Output

examples/printchar.umm:

어떻게
식........... .......ㅋ
식ㅋ
이 사람이름이냐ㅋㅋ

Explanation:

  1. ........... ....... : Print ASCII character 18
  2. 식ㅋ : Print newline

Conditional

examples/conditional.umm:

어떻게
엄...
동탄어?.....!..!
이 사람이름이냐ㅋㅋ

Explanation:

  1. ... : Assign 3 to variable 1
  2. 동탄어?.....! : If variable 1 is not 0, print 5
  3. ..! : Print 2

Implementation Details

Lexer

The lexer handles Korean characters and special tokens:

  • Recognizes keywords: 어떻게, , , , , 동탄, 화이팅, 이 사람이름이냐ㅋㅋ
  • Handles repeated for variable indexing (e.g., 어어어 = variable 3)
  • Processes dots (.) and commas (,) as number literals
  • Supports one-line programs with ~ separator

Parser

Builds an Abstract Syntax Tree (AST) with:

  • Expression nodes: Number, Variable, Add, Sub, Mul
  • Statement nodes: Assign, Input, Print, Conditional, Goto, Return
  • Handles operator precedence (multiplication before addition)

Code Generator

Generates optimized Lamina IR:

  • Two-pass compilation: First pass analyzes which variables are used, second pass generates code
  • Lazy variable allocation: Only allocates stack space for variables that are actually used in the program
  • SSA form: Uses Static Single Assignment with proper load/store operations
  • Memory management: Variables stored on stack using alloc.ptr.stack, initialized to 0
  • Control flow: Generates proper basic blocks for conditionals and goto statements

Example output for a simple variable program:

fn @main() -> i64 {
  entry:
    %var_ptr_0 = alloc.ptr.stack i64  # Only allocate used variables
    store.i64 %var_ptr_0, 0

  line_1:
    %t0 = add.i64 3, 0
    store.i64 %var_ptr_0, %t0
    %t1 = load.i64 %var_ptr_0
    print %t1
    ret.i64 0
}

Implementation Status

Implemented

  • ✅ Lexer/Tokenizer with Korean character support
  • ✅ Parser (AST generation)
  • ✅ Optimized Lamina IR code generation
  • ✅ Basic arithmetic operations (add, subtract, multiply)
  • ✅ Smart variable management (lazy allocation, only used variables)
  • ✅ Console output (numbers and characters)
  • ✅ Newline output
  • ✅ Conditionals (동탄)
  • ✅ GOTO (준)
  • ✅ Program exit (화이팅!)

Not Implemented

  • ❌ Console input (식?) - Placeholder only (requires external function call)
  • ❌ Full compilation to native code (generates Lamina IR only)

Project Structure

umjunsik-lang-lamina/
├── src/
│   ├── main.rs          # Main executable
│   ├── lib.rs           # Library root
│   ├── token.rs         # Token definitions
│   ├── lexer.rs         # Lexer (tokenization)
│   ├── ast.rs           # Abstract syntax tree
│   ├── parser.rs        # Parser
│   └── codegen.rs       # Lamina IR code generator
├── examples/            # Example programs
├── Cargo.toml
└── README.md

License

Apache License 2.0

References

  • Lamina - High-performance compiler backend

Dependencies

~4MB
~39K SLoC