A Clojure/ClojureScript implementation of Token-Oriented Object Notation (TOON), a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.
TOON uses about 49% fewer tokens than formatted JSON (28% fewer than compact JSON) while keeping an explicit structure that helps LLMs parse and validate data reliably. It is intended for LLM input as a lossless, drop-in representation of JSON data.
Specification: This library implements the TOON v3.0 specification
Reference Implementation: TypeScript/JavaScript
When working with Large Language Models, token efficiency directly affects cost, context window usage, and processing speed. Standard JSON is verbose and therefore token-expensive.
TOON's sweet spot is uniform arrays of objects – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.
Based on benchmarks using the GPT-5 o200k_base tokenizer:
- 49.1% reduction vs formatted JSON (2-space indentation)
- 28.0% reduction vs compact JSON (minified)
- 39.4% reduction vs YAML
- 56.0% reduction vs XML
Real-world examples:
- GitHub repositories (100 items): 42.3% fewer tokens than JSON
- Daily analytics (180 days): 58.9% fewer tokens than JSON
- E-commerce orders: 35.4% fewer tokens than JSON
- 💸 Token-efficient: Eliminates redundant punctuation and repeated keys
- 🤿 LLM-friendly guardrails: Explicit lengths and fields enable validation
- 🍱 Minimal syntax: Removes braces, brackets, and most quotes
- 📐 Indentation-based: Uses whitespace like YAML instead of braces
- 🧺 Tabular arrays: Declare keys once, stream data as rows
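The "explicit lengths and fields" guardrail can be illustrated with a small check that compares a tabular header against the rows that follow it. This is a minimal sketch, not the library's validator: `header-re` and `check-tabular` are hypothetical names, the header pattern is inferred from the examples in this README, and quoted cells containing the delimiter are not handled. It assumes Clojure 1.11+ for `parse-long`.

```clojure
(require '[clojure.string :as str])

;; Assumed header shape, e.g. "users[2]{id,name,role}:" (label may be empty).
(def header-re #"^(\w*)\[(\d+)\]\{([^}]*)\}:$")

(defn check-tabular
  "Compares the declared row count and field count of a comma-delimited
  tabular block with the rows actually present. Hypothetical helper."
  [block]
  (let [[header & rows]     (str/split-lines block)
        [_ _label n fields] (re-matches header-re header)
        declared            (some-> n parse-long)
        width               (count (str/split (or fields "") #"," -1))
        ;; limit -1 keeps trailing empty cells so widths are counted correctly
        cells               (map #(str/split (str/trim %) #"," -1) rows)]
    {:declared declared
     :actual   (count rows)
     :ok?      (boolean (and declared
                             (= declared (count rows))
                             (every? #(= width (count %)) cells)))}))

(check-tabular "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user")
;=> {:declared 2, :actual 2, :ok? true}
```

A truncated or padded block fails the check because the declared count no longer matches the number of rows, which is the kind of mismatch the explicit header makes detectable.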
TOON excels at:
- Uniform arrays of objects (same fields, primitive values)
- Large datasets with consistent structure
- Tabular data with multiple rows
JSON is better for:
- Non-uniform data with varying field sets
- Deeply nested structures
- Mixed-type collections
CSV is more compact for:
- Flat, uniform tables without any nesting
- Data without nested objects or arrays
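The "uniform arrays of objects" criterion above can be checked mechanically before choosing a format. The sketch below is an illustration only: `primitive?` and `uniform-rows?` are hypothetical helpers, not part of this library, and the definition of "primitive" is an assumption based on the types listed in this README.

```clojure
(defn primitive?
  "True for values assumed to fit in a single tabular cell."
  [v]
  (or (nil? v) (string? v) (number? v) (boolean? v) (keyword? v)))

(defn uniform-rows?
  "True when coll is a non-empty sequential collection of maps that all
  share the same key set and contain only primitive values."
  [coll]
  (boolean
   (and (sequential? coll)
        (seq coll)
        (every? map? coll)
        (apply = (map (comp set keys) coll))
        (every? #(every? primitive? (vals %)) coll))))

(uniform-rows? [{:id 1 :name "Alice"} {:id 2 :name "Bob"}]) ;=> true
(uniform-rows? [{:id 1} {:id 2 :name "Bob"}])               ;=> false
```

Data that passes this check is a candidate for the tabular form; data that fails it falls into the cases where JSON or the list form may be the better fit.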
;; deps.edn
com.vadelabs/toon {:mvn/version "2025.12.01-36"}
;; Leiningen
[com.vadelabs/toon "2025.12.01-36"]
(require '[com.vadelabs.toon.core :as toon])
;; Encode Clojure data to TOON
(toon/encode {:name "Alice" :age 30 :tags ["dev" "rust"]})
;=> "name: Alice\nage: 30\ntags[2]: dev,rust"
;; Decode TOON to Clojure data
(toon/decode "name: Alice\nage: 30\ntags[2]: dev,rust")
;=> {"name" "Alice", "age" 30.0, "tags" ["dev" "rust"]}
JSON:
{
  "name": "Alice",
  "age": 30,
  "active": true
}
TOON:
name: Alice
age: 30
active: true
JSON:
{
  "user": {
    "name": "Alice",
    "email": "[email protected]"
  }
}
TOON:
user:
  name: Alice
  email: [email protected]
JSON:
{
  "tags": ["reading", "gaming", "coding"]
}
TOON:
tags[3]: reading,gaming,coding
This is TOON's sweet spot – uniform arrays of objects with consistent fields:
JSON:
{
  "users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"}
  ]
}
TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
The tabular format eliminates repeated keys, providing significant token savings for large datasets.
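To make the "declare keys once, stream rows" idea concrete, here is a minimal sketch of how a uniform vector of maps could be rendered in tabular form. `tabular-block` is a hypothetical helper written for illustration, not the library's encoder: it assumes a comma delimiter and two-space indentation, and it does no quoting or escaping of cell values.

```clojure
(require '[clojure.string :as str])

(defn tabular-block
  "Renders rows (maps) under label, emitting the field names once in the
  header and one delimited line per row. Hypothetical sketch."
  [label fields rows]
  (let [header (str label "[" (count rows) "]{"
                    (str/join "," (map name fields)) "}:")
        line   (fn [row]
                 (str "  " (str/join "," (map #(get row %) fields))))]
    (str/join "\n" (cons header (map line rows)))))

(tabular-block "users" [:id :name :role]
               [{:id 1 :name "Alice" :role "admin"}
                {:id 2 :name "Bob" :role "user"}])
;=> "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"
```

Each additional row costs only its values and delimiters, whereas the JSON form repeats every key for every object.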
For non-uniform data, TOON uses list format:
TOON:
items[3]:
  - name: Laptop
    price: 999
  - name: Mouse
    price: 29
  - name: Keyboard
    price: 79
Encodes Clojure data structures to TOON format.
(encode input)
(encode input options)
Parameters:
- input - Any Clojure value (normalized to JSON-compatible types)
- options - Optional map:
  - :indent - Spaces per indentation level (default: 2)
  - :delimiter - Array value delimiter: "," (default), "\t", or "|"
  - :key-collapsing - Key collapsing mode: :off (default) or :safe
  - :flatten-depth - Max depth for key collapsing (default: Infinity)
  - :replacer - Function (fn [key value path] ...) to transform/filter values
Returns: String in TOON format
Examples:
;; Basic encoding
(encode {:name "Ada" :tags ["reading" "gaming"]})
;=> "name: Ada\ntags[2]: reading,gaming"
;; Custom delimiter
(encode {:tags ["a" "b" "c"]} {:delimiter "\t"})
;=> "tags[3\t]: a\tb\tc"
;; Tabular array format
(encode [{:id 1 :name "Alice"}
         {:id 2 :name "Bob"}])
;=> "[2]{id,name}:\n  1,Alice\n  2,Bob"
;; Using replacer to filter sensitive fields
(encode {:name "Alice" :password "secret"}
        {:replacer (fn [k v _] (when-not (= k "password") v))})
;=> "name: Alice"
;; Using replacer to transform values
(require '[clojure.string :as str])
(encode {:status "active"}
        {:replacer (fn [k v _] (if (string? v) (str/upper-case v) v))})
;=> "status: ACTIVE"
Decodes TOON format to Clojure data structures.
(decode input)
(decode input options)
Parameters:
- input - String in TOON format
- options - Optional map:
  - :indent - Spaces per indentation level (default: 2)
  - :strict - Enable strict validation (default: true)
Returns: Clojure data structure (maps, vectors, primitives)
Examples:
;; Basic decoding
(decode "name: Ada\ntags[2]: reading,gaming")
;=> {"name" "Ada", "tags" ["reading" "gaming"]}
;; Tabular array
(decode "[2]{id,name}:\n  1,Alice\n  2,Bob")
;=> [{"id" 1.0, "name" "Alice"} {"id" 2.0, "name" "Bob"}]
;; Inline array
(decode "[3]: 1,2,3")
;=> [1.0 2.0 3.0]
;; Relaxed mode (allows tabs, inconsistent indentation)
(decode "name: Ada" {:strict false})
;=> {"name" "Ada"}
string: Hello World
number: 42
float: 3.14
boolean: true
nil: null
Strings are quoted when they contain special characters:
comma: "a,b"
colon: "key:value"
reserved: "true"
newline: "line1\nline2"
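The quoting behaviour shown in these examples can be sketched as a predicate plus an escaping step. This is an illustration derived only from the four cases above, with `false` and `null` assumed to be reserved alongside `true`. `reserved-literals`, `needs-quotes?`, and `render-string` are hypothetical names; the specification defines the complete rule set, which may cover more cases than this sketch.

```clojure
(require '[clojure.string :as str])

;; Assumption: these literals would be read back as non-strings if unquoted.
(def reserved-literals #{"true" "false" "null"})

(defn needs-quotes?
  "True when s is a reserved literal or contains a comma, colon, or newline."
  [s]
  (boolean
   (or (contains? reserved-literals s)
       (some #(str/includes? s %) ["," ":" "\n"]))))

(defn render-string
  "Returns s unchanged when it is safe, otherwise a quoted, escaped form."
  [s]
  (if (needs-quotes? s)
    (str "\"" (str/escape s {\newline "\\n" \" "\\\"" \\ "\\\\"}) "\"")
    s))

(render-string "Hello World") ;=> "Hello World"
(render-string "a,b")         ;=> "\"a,b\""
```

Leaving ordinary strings such as `Hello World` unquoted is where the savings come from; quotes appear only where the value would otherwise be ambiguous.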
Key-value pairs separated by colons:
name: Alice
age: 30
Nested objects use indentation:
user:
  name: Alice
  email: [email protected]
Inline format (primitives):
tags[3]: reading,gaming,coding
Tabular format (objects with same keys):
[3]{id,name}:
  1,Alice
  2,Bob
  3,Carol
List format (mixed items):
items[2]:
  - name: Laptop
    price: 999
  - name: Mouse
    price: 29
Custom delimiter:
tags[3|]: a|b|c
tags[3\t]: a\tb\tc
Length marker:
items[#3]: 1,2,3
TOON normalizes Clojure types to JSON-compatible values:
- Keywords → Strings: :name → "name"
- Sets → Sorted vectors: #{3 1 2} → [1 2 3]
- All numbers → Doubles: 42 → 42.0
- Maps → String-keyed maps: {:a 1} → {"a" 1.0}
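The normalization rules above can be approximated with `clojure.walk/postwalk`. This is a minimal sketch under stated assumptions, not the library's implementation: `normalize` is a hypothetical name, namespaces on keywords are dropped by `name`, and sets are assumed to hold mutually comparable values so that `sort` succeeds.

```clojure
(require '[clojure.walk :as walk])

(defn normalize
  "Converts keywords to strings, numbers to doubles, and sets to sorted
  vectors, recursively. Map keys are converted along with everything else."
  [x]
  (walk/postwalk
   (fn [v]
     (cond
       (keyword? v) (name v)
       (number? v)  (double v)
       (set? v)     (vec (sort v))
       :else        v))
   x))

(normalize {:a 1})     ;=> {"a" 1.0}
(normalize #{3 1 2})   ;=> [1.0 2.0 3.0]
```

Because `postwalk` transforms inner values first, the set elements are already doubles by the time the set is sorted, so this sketch applies the number rule and the set rule together.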
# Run all Clojure tests
bb test
# Run all tests (Clojure + Babashka)
bb test:all
# Run CI pipeline with tests
bb ci
# Generate test coverage report
bb coverage
The library includes:
- 340+ unit tests with 90%+ code coverage
- Property-based tests using test.check
- Comprehensive roundtrip testing
- Edge case coverage
Coverage reports are generated in target/coverage/ including:
- HTML report: target/coverage/index.html
- Codecov JSON: target/coverage/codecov.json
We welcome contributions! Please see CONTRIBUTING.md for:
- Development setup
- Coding guidelines
- Testing requirements
- Pull request process
- Fork the repository
- Create a feature branch: git checkout -b feature/my-feature
- Make your changes with tests
- Run tests: bb test
- Commit with clear messages: git commit -m "add feature X"
- Push and create a pull request
This implementation follows the TOON v2.0 specification (2025-11-10).
For detailed format rules, edge cases, and conformance requirements, see:
- Full Specification - Complete technical specification
- Conformance Tests - Language-agnostic test fixtures
- Examples - Example TOON files
- Changelog - Spec version history
Detailed benchmarks comparing TOON against JSON, YAML, XML, and CSV across multiple datasets and LLM models are available in the reference implementation repository.
Key findings:
- Token efficiency: 49% fewer tokens than formatted JSON on average
- Retrieval accuracy: 70.1% (TOON) vs 65.4% (JSON) across 4 LLMs
- Best case: 58.9% reduction for uniform tabular data (daily analytics)
Token counts are measured using the GPT-5 o200k_base tokenizer. Actual savings vary by model and tokenizer.
- TypeScript/JavaScript: toon-format/toon (reference implementation)
- Python: toon-format/toon-python (in development)
- Rust: toon-format/toon-rust (in development)
- .NET: ToonSharp
- C++: ctoon
- Crystal: toon-crystal
- Dart: toon
- Elixir: toon_ex
- Gleam: toon_codec
- Go: gotoon
- Java: JToon
- Lua/Neovim: toon.nvim
- OCaml: ocaml-toon
- PHP: toon-php
- Python: python-toon
- Ruby: toon-ruby
- Swift: TOONEncoder
Note: When implementing TOON in other languages, follow the specification to ensure compatibility. The conformance tests provide language-agnostic validation.
- Conformance test suite integration
- Performance benchmarks vs JSON for Clojure
- ClojureScript browser optimization
- Streaming encoder/decoder
- Custom type handlers
Copyright © 2025 Vade Labs Pvt. Ltd.
Distributed under the MIT License. See LICENSE for details.