Token-Oriented Object Notation (TOON) is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage.
This is an OCaml port of the TOON library originally written in TypeScript.
TOON excels at uniform complex objects – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.
AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money – and standard JSON is verbose and token-expensive:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON conveys the same information with fewer tokens:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

- 💸 Token-efficient: typically 30–60% fewer tokens than JSON
- 🤿 LLM-friendly guardrails: explicit lengths and field lists help models validate output
- 🍱 Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
- 📐 Indentation-based structure: replaces braces with whitespace for better readability
- 🧺 Tabular arrays: declare keys once, then stream rows without repetition
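
Because the header states both the row count and the field list, a tabular block can be checked mechanically. The following is a minimal standalone sketch and not part of ocaml-toon: `check_tabular` is a hypothetical helper that assumes comma delimiters and unquoted values, and compares the declared length and field list against the rows that follow.

```ocaml
(* Hypothetical helper, not part of ocaml-toon.
   Assumes comma delimiters and values without quotes or embedded commas. *)
let check_tabular (text : string) : (unit, string) result =
  match String.split_on_char '\n' text with
  | [] -> Error "empty input"
  | header :: rows -> (
      match
        ( String.index_opt header '[',
          String.index_opt header ']',
          String.index_opt header '{',
          String.index_opt header '}' )
      with
      | Some lb, Some rb, Some lc, Some rc when lb < rb && rb < lc && lc < rc -> (
          (* Text between [ and ] is the declared row count. *)
          let declared = String.sub header (lb + 1) (rb - lb - 1) in
          (* Text between { and } is the comma-separated field list. *)
          let fields =
            String.split_on_char ',' (String.sub header (lc + 1) (rc - lc - 1))
          in
          let width = List.length fields in
          match int_of_string_opt declared with
          | None -> Error "declared length is not an integer"
          | Some n when n <> List.length rows ->
              Error
                (Printf.sprintf "header declares %d rows, found %d" n
                   (List.length rows))
          | Some _ ->
              let row_ok row =
                List.length (String.split_on_char ',' row) = width
              in
              if List.for_all row_ok rows then Ok ()
              else Error "a row does not match the field list")
      | _ -> Error "malformed header")
```

A check of this kind could be run on model output before handing it to a full decoder, so that a truncated or padded table is caught early.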
NOTE: ocaml-toon currently works exclusively with yojson. Support for other JSON libraries or custom adapters could be added in the future; please open an issue if you need it.
The package is not yet published to the opam repository, so it needs to be pinned:

opam pin add toon.dev "https://github.com/davesnx/ocaml-toon.git"

When using Dune package management, add the pin to your dune-project, then run dune pkg lock and dune build:

(pin
 (url "git+https://github.com/davesnx/ocaml-toon.git")
 (package (name toon)))

(package
 (name your-package)
 (depends
  toon))

let data = Yojson.Basic.from_string {|
{
  "user": {
    "id": 123,
    "name": "Ada",
    "tags": ["reading", "gaming"],
    "active": true,
    "preferences": []
  }
}
|}
let () =
  print_endline (Toon.encode data)
(*
user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true
  preferences[0]:
*)

You can also decode TOON back to JSON values:

let toon = {|
user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true
  preferences[0]:
|}
let () =
  match Toon.decode toon with
  | Ok (value: Yojson.Basic.t) ->
    Printf.printf "%s\n" (Yojson.Basic.to_string value)
  | Error error ->
    Printf.eprintf "Decode error: %s\n" (Toon.error_to_string error)

Toon.decode decodes a TOON-formatted string into a JSON value. It returns Ok value on success or Error error on decode failure.

type error =
  [ `Unterminated_quoted_string
  | `Expected_quote
  | `Invalid_escape_sequence
  | `No_colon_in_line of string
  | `Invalid_array_syntax
  | `Array_length_mismatch
  | `Invalid_number_format ]

match Toon.decode "tags[3]: a,b,c" with
| Ok json -> Printf.printf "%s\n" (Yojson.Basic.to_string json)
| Error error -> Printf.eprintf "Error: %s\n" (Toon.error_to_string error)

Toon.encode encodes a JSON value to TOON format. It returns a TOON-formatted string with no trailing newline and no trailing spaces.

Toon.encode (`Assoc [("id", `Int 1); ("name", `String "Ada")])
(* => "id: 1\nname: Ada" *)

Toon.pp pretty-prints TOON format using OCaml's Format module.

let data = `Assoc [("id", `Int 123); ("name", `String "Ada")]
let s = Format.asprintf "%a" Toon.pp data

Toon.error_to_string converts a decode error to a human-readable string.

match Toon.decode "invalid[" with
| Ok _ -> ()
| Error err ->
  Printf.eprintf "Decode failed: %s\n" (Toon.error_to_string err)

TOON formatting is deterministic and minimal:

- Indentation: 2 spaces per nesting level.
- Lines:
  - key: value for primitives (single space after colon).
  - key: for nested/empty objects (no trailing space on that line).
- Arrays:
  - Delimiter encoding: Comma delimiters are implicit in array headers (e.g., tags[3]:, items[2]{id,name}:). Tab and pipe delimiters are explicitly shown in array headers (e.g., tags[3|]:, items[2 ]{id name}:).
  - Primitive arrays inline: key[N]: v1,v2 (comma) or key[N<delim>]: v1<delim>v2 (tab/pipe).
  - Tabular arrays: key[N]{f1,f2}: … (comma) or key[N<delim>]{f1<delim>f2}: … (tab/pipe).
  - List items: two spaces, hyphen, space ("  - …").
- Whitespace invariants:
  - No trailing spaces at end of any line.
  - No trailing newline at end of output.
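
The whitespace rules above are simple enough to check on any encoded string. The sketch below is a standalone illustration, not part of the library: `obeys_whitespace_rules` is a hypothetical name, and the even-indentation check assumes every line is indented in whole 2-space levels.

```ocaml
(* Hypothetical checker, not part of ocaml-toon. *)

(* True when the line ends with a space character. *)
let ends_with_space (line : string) : bool =
  let n = String.length line in
  n > 0 && line.[n - 1] = ' '

(* Number of space characters at the start of the line. *)
let leading_spaces (line : string) : int =
  let rec go i =
    if i < String.length line && line.[i] = ' ' then go (i + 1) else i
  in
  go 0

let obeys_whitespace_rules (output : string) : bool =
  let n = String.length output in
  (* Invariant: no trailing newline at the end of the output. *)
  let no_trailing_newline = (n = 0) || (output.[n - 1] <> '\n') in
  let line_ok line =
    (* Invariant: no trailing spaces; indentation is a multiple of 2. *)
    (not (ends_with_space line)) && leading_spaces line mod 2 = 0
  in
  no_trailing_newline
  && List.for_all line_ok (String.split_on_char '\n' output)
```

Such a check could serve as a property in tests that feed arbitrary values through the encoder.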
Simple objects with primitive values:
Toon.encode (`Assoc [
  ("id", `Int 123);
  ("name", `String "Ada");
  ("active", `Bool true)
])

id: 123
name: Ada
active: true

Nested objects:
Toon.encode (`Assoc [
  ("user", `Assoc [
    ("id", `Int 123);
    ("name", `String "Ada")
  ])
])

user:
  id: 123
  name: Ada

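
To make the nesting rule concrete, here is a minimal sketch of how 2-space indentation could be produced recursively. It is not the library's implementation: the `value` type and `encode_obj` are hypothetical stand-ins for Yojson values and `Toon.encode`, and strings are emitted without any quoting.

```ocaml
(* Hypothetical miniature value type; the real library works on Yojson.Basic.t. *)
type value =
  | Str of string
  | Int of int
  | Bool of bool
  | Obj of (string * value) list

(* Render one key/value pair as a list of lines at the given nesting depth. *)
let rec lines (depth : int) ((key, v) : string * value) : string list =
  let pad = String.make (2 * depth) ' ' in
  match v with
  | Str s -> [ pad ^ key ^ ": " ^ s ]
  | Int i -> [ pad ^ key ^ ": " ^ string_of_int i ]
  | Bool b -> [ pad ^ key ^ ": " ^ string_of_bool b ]
  | Obj fields ->
      (* "key:" with no trailing space, then children one level deeper. *)
      (pad ^ key ^ ":") :: List.concat_map (lines (depth + 1)) fields

(* Join with newlines so the output has no trailing newline. *)
let encode_obj (fields : (string * value) list) : string =
  String.concat "\n" (List.concat_map (lines 0) fields)
```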
Tip: TOON includes the array length in brackets (e.g., items[3]). When using comma delimiters (default), the delimiter is implicit. When using tab or pipe delimiters, the delimiter is explicitly shown in the header (e.g., tags[2|] or [2 ]). This encoding helps LLMs identify the delimiter and track the number of elements, reducing errors when generating or validating structured output.

Toon.encode (`Assoc [
  ("tags", `List [`String "admin"; `String "ops"; `String "dev"])
])

tags[3]: admin,ops,dev

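
The header convention described in the tip can be sketched as a small formatter. This is an illustration only: the `delimiter` type and `inline_array` are hypothetical and do not reflect how ocaml-toon exposes delimiter selection, and values are assumed to need no quoting.

```ocaml
(* Hypothetical types and helpers, not part of ocaml-toon. *)
type delimiter = Comma | Tab | Pipe

(* String placed between values. *)
let separator = function Comma -> "," | Tab -> "\t" | Pipe -> "|"

(* Comma is implicit in the header; tab and pipe are written after the length. *)
let header_marker = function Comma -> "" | d -> separator d

let inline_array ?(delimiter = Comma) (key : string) (values : string list) :
    string =
  let header =
    Printf.sprintf "%s[%d%s]:" key (List.length values)
      (header_marker delimiter)
  in
  match values with
  | [] -> header (* empty array: no trailing space after the colon *)
  | _ -> header ^ " " ^ String.concat (separator delimiter) values
```

For instance, `inline_array ~delimiter:Pipe "tags" ["a"; "b"]` yields `tags[2|]: a|b`, while the default call keeps the header free of any delimiter marker.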
When all objects share the same primitive fields, TOON uses an efficient tabular format:
Toon.encode (`Assoc [
  ("items", `List [
    `Assoc [
      ("sku", `String "A1");
      ("qty", `Int 2);
      ("price", `Float 9.99)
    ];
    `Assoc [
      ("sku", `String "B2");
      ("qty", `Int 1);
      ("price", `Float 14.5)
    ]
  ])
])

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5

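
The tabular layout comes down to one header line plus one indented line per row. The sketch below shows that shape with plain strings; `tabular` is a hypothetical helper rather than the library's encoder, and it assumes comma delimiters, values already rendered as text, and rows that all have the same fields in the same order.

```ocaml
(* Hypothetical helper, not part of ocaml-toon.
   fields: the shared keys, declared once in the header.
   rows: each row holds one pre-rendered value per field, in field order. *)
let tabular (key : string) (fields : string list) (rows : string list list) :
    string =
  let header =
    Printf.sprintf "%s[%d]{%s}:" key (List.length rows)
      (String.concat "," fields)
  in
  (* Rows sit one nesting level (2 spaces) below the header. *)
  let row_line row = "  " ^ String.concat "," row in
  String.concat "\n" (header :: List.map row_line rows)

let () =
  print_endline
    (tabular "items" [ "sku"; "qty"; "price" ]
       [ [ "A1"; "2"; "9.99" ]; [ "B2"; "1"; "14.5" ] ])
```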
After checking out the repo, ensure you have dune preview installed (https://preview.dune.build) and install dependencies:
make install # install dependencies
make build # build the project
make test # run the tests
make test-watch # run the tests in watch mode
make bench # run some benchmarks
make utop # run a REPL with the library loaded

- Fork it (https://github.com/davesnx/ocaml-toon/fork)
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request

The project is available as open source under the terms of the MIT License.
This is an OCaml port of the original TOON library by Johann Schopplich.