A complete .NET library for working with TOON (Token-Oriented Object Notation) - a compact, human-readable text format optimized for LLM prompts and structured data interchange.
TOON is a line-oriented, indentation-based notation that encodes JSON data with explicit structure and minimal quoting. Think of it as:
- More compact than JSON for arrays of uniform objects (no repeated keys)
- More structured than CSV with nesting, types, and field names
- More deterministic than YAML with explicit array lengths and fixed formatting rules
Perfect for LLM prompts, configuration files, and data interchange where token efficiency and readability matter.
- Complete lexical analyzer with all TOON token types
- Full AST parser with resilient error recovery
- Token-to-AST navigation - easily map between tokens and syntax nodes
- Tokens included in parse results - no separate tokenization call needed
- Position tracking for every token and AST node (line, column, span)
- Resilient parsing - continues after errors, returns partial AST
- Rich error reporting - collects all errors with precise locations and error codes
- Standardized error codes - 20+ error codes (TOON1xxx-9xxx) for programmatic handling
- Context-aware error messages - every error explains what, why, and how to fix
- Visitor pattern for AST traversal and transformation
- Extension methods for syntax highlighting and IDE integration
- TOON spec §6.1 compliance - array size validation (detects size mismatches)
- Battle-tested with 637 unit tests (100% passing)
Targets: .NET Standard 2.0 (maximum compatibility)
dotnet add package ToonTokenizer
Or via Package Manager Console:
Install-Package ToonTokenizer

using ToonTokenizer;
var source = @"
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
";
// Parse returns: Document (AST), Errors (if any), and Tokens
var result = Toon.Parse(source);
if (result.IsSuccess)
{
    // Access the parsed document
    foreach (var property in result.Document.Properties)
    {
        Console.WriteLine($"{property.Key}: {property.Value}");
    }
    // Access tokens for syntax highlighting
    foreach (var token in result.Tokens)
    {
        Console.WriteLine($"{token.Type}: '{token.Value}' at {token.Line}:{token.Column}");
    }
}
else
{
    // Resilient parsing: you still get a partial AST + all errors
    Console.WriteLine($"Found {result.Errors.Count} error(s):");
    foreach (var error in result.Errors)
    {
        Console.WriteLine($"  Line {error.Line}: {error.Message}");
    }
}

if (Toon.TryParse(source, out var result))
{
    if (result.IsSuccess)
        Console.WriteLine("Valid TOON");
    else
        Console.WriteLine($"{result.Errors.Count} error(s) found");
}

// Get tokens without parsing
var tokens = Toon.Tokenize(source);
foreach (var token in tokens)
{
    Console.WriteLine($"{token.Type}: {token.Value}");
}

var source = "name: John\nage: 30";
var result = Toon.Parse(source);
// Get a token and find which AST node it belongs to
var token = result.Tokens.Find(t => t.Value == "30");
var property = token.GetPropertyNode(result.Document);
Console.WriteLine($"Token '{token.Value}' belongs to property: {property.Key}");
// Output: Token '30' belongs to property: age
// Or find property at a specific line/column
var prop = result.GetPropertyAt(line: 2, column: 1);
Console.WriteLine($"Property at line 2: {prop.Key}");
// Output: Property at line 2: age
// Find nested properties by path
var theme = result.FindPropertyByPath("user.settings.theme");
if (theme?.Value is StringValueNode str)
{
    Console.WriteLine($"Theme: {str.Value}");
}

name: John Doe
age: 30
active: true
email: [email protected]
user:
  name: Jane Smith
  email: [email protected]
  settings:
    theme: dark
    notifications: true
colors[3]: red,green,blue
scores[5]: 95,87,92,88,91
The killer feature! No repeated keys:
users[3]{id,name,email,active}:
  1,Alice,[email protected],true
  2,Bob,[email protected],false
  3,Charlie,[email protected],true
Compare to JSON:
{
  "users": [
    {"id": 1, "name": "Alice", "email": "[email protected]", "active": true},
    {"id": 2, "name": "Bob", "email": "[email protected]", "active": false},
    {"id": 3, "name": "Charlie", "email": "[email protected]", "active": true}
  ]
}
60% fewer tokens!
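To make the "no repeated keys" point concrete, here is a minimal sketch of how uniform rows could be written in the tabular form. `ToonTableSketch` and `EncodeTable` are hypothetical names invented for this illustration and are not part of ToonTokenizer. The quoting rule is a deliberate simplification and does not cover the full TOON specification (for example, strings that look like numbers or booleans).

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of ToonTokenizer.
public static class ToonTableSketch
{
    public static string EncodeTable(string name, string[] fields, object[][] rows)
    {
        var sb = new StringBuilder();
        // Header: key, declared row count, and the field names written exactly once.
        sb.Append(name).Append('[').Append(rows.Length).Append("]{")
          .Append(string.Join(",", fields)).Append("}:");

        foreach (var row in rows)
        {
            if (row.Length != fields.Length)
                throw new ArgumentException("Every row needs one value per field.");
            // Each row sits one level (two spaces) under the header.
            sb.Append('\n').Append("  ").Append(string.Join(",", row.Select(FormatValue)));
        }
        return sb.ToString();
    }

    private static string FormatValue(object value)
    {
        switch (value)
        {
            case null: return "null";
            case bool b: return b ? "true" : "false";
            case string s: return NeedsQuotes(s) ? Quote(s) : s;
            case IFormattable f: return f.ToString(null, CultureInfo.InvariantCulture);
            default: return value.ToString();
        }
    }

    // Simplified assumption: quote empty strings, padded strings, and strings
    // containing a delimiter, colon, quote, backslash, or newline.
    private static bool NeedsQuotes(string s) =>
        s.Length == 0 || s != s.Trim() ||
        s.IndexOfAny(new[] { ',', ':', '"', '\\', '\n' }) >= 0;

    private static string Quote(string s) =>
        "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"").Replace("\n", "\\n") + "\"";
}
```

Calling `EncodeTable("users", new[] { "id", "name", "active" }, ...)` with the rows `1, "Alice", true` and `2, "Bob", false` produces a `users[2]{id,name,active}:` header followed by the two indented rows `1,Alice,true` and `2,Bob,false`. The field names appear once in the header rather than once per object, which is where the savings over JSON come from.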
context:
  task: Favorite hiking trails
  location: Boulder, CO
  season: Spring 2025
friends[3]: Ana,Luis,Sam
hikes[3]{id,name,distance,elevation,companion,sunny}:
  1,Blue Lake Trail,7.5,320,Ana,true
  2,Ridge Overlook,9.2,540,Luis,false
  3,Wildflower Loop,5.1,180,Sam,true
notes:
  best: Ridge Overlook has amazing views!
  bring: Water, snacks, sunscreen
// Parse TOON source (returns Document, Errors, and Tokens)
ToonParseResult Parse(string source)
// Validate and parse (returns true for completed parse, even with errors)
bool TryParse(string source, out ToonParseResult result)
// Tokenize only
List<Token> Tokenize(string source)

public class ToonParseResult
{
    public ToonDocument Document { get; } // Always available (even with errors)
    public List<ToonError> Errors { get; } // Empty if no errors
    public List<Token> Tokens { get; } // All tokens from lexing
    public bool IsSuccess => Errors.Count == 0;
    public bool HasErrors => Errors.Count > 0;
}

public class ToonError
{
    public string Message { get; }
    public string? Code { get; } // Error code (e.g., "TOON1001")
    public int Position { get; } // 0-based character offset
    public int Length { get; } // Length of error span
    public int Line { get; } // 1-based line number
    public int Column { get; } // 1-based column number
    public int EndPosition { get; } // Position + Length
}

All errors include standardized error codes for programmatic handling and filtering:
var result = Toon.Parse(source);
foreach (var error in result.Errors)
{
    // Errors include descriptive messages with fix suggestions
    Console.WriteLine($"[{error.Code}] {error.Message}");
    // Filter by error type
    if (error.Code?.StartsWith("TOON1") == true)
        Console.WriteLine("  → Lexer/tokenization error");
    else if (error.Code?.StartsWith("TOON2") == true)
        Console.WriteLine("  → Parser structural error");
    else if (error.Code?.StartsWith("TOON3") == true)
        Console.WriteLine("  → Validation error");
}

Error Code Categories:
| Category | Range | Description | Examples |
|---|---|---|---|
| Lexer | TOON1xxx | Tokenization errors | TOON1001 Unterminated string; TOON1002 Invalid escape sequence; TOON1003 Invalid character |
| Parser | TOON2xxx | Structural errors | TOON2001 Expected property key; TOON2002 Expected colon; TOON2003 Expected right bracket; TOON2004 Expected value; TOON2005 Expected delimiter |
| Validation | TOON3xxx | Semantic errors | TOON3001 Array size mismatch; TOON3002 Table array size mismatch; TOON3003 Table row field mismatch |
| Delimiters | TOON4xxx | Delimiter issues | TOON4001 Mixed delimiters; TOON4002 Delimiter marker misplaced |
| Indentation | TOON5xxx | Indentation problems | TOON5001 Unexpected indentation; TOON5002 Inconsistent indentation |
| Internal | TOON9xxx | Library bugs | TOON9001 Infinite loop detected |
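The category table above can be turned into a small lookup when errors need to be grouped or filtered. The following is only a sketch: `ToonErrorCategory` and `ToonErrorCodes.Categorize` are hypothetical names introduced here, not types shipped by the library, and the mapping simply mirrors the ranges listed in the table.

```csharp
using System;

// Hypothetical types for illustration; not part of ToonTokenizer.
public enum ToonErrorCategory
{
    Lexer, Parser, Validation, Delimiters, Indentation, Internal, Unknown
}

public static class ToonErrorCodes
{
    // Maps a code such as "TOON3001" to its category using the first digit.
    public static ToonErrorCategory Categorize(string code)
    {
        if (code == null || code.Length != 8 ||
            !code.StartsWith("TOON", StringComparison.Ordinal))
            return ToonErrorCategory.Unknown;

        // The remaining four characters must all be digits.
        for (int i = 4; i < 8; i++)
            if (code[i] < '0' || code[i] > '9')
                return ToonErrorCategory.Unknown;

        switch (code[4])
        {
            case '1': return ToonErrorCategory.Lexer;
            case '2': return ToonErrorCategory.Parser;
            case '3': return ToonErrorCategory.Validation;
            case '4': return ToonErrorCategory.Delimiters;
            case '5': return ToonErrorCategory.Indentation;
            case '9': return ToonErrorCategory.Internal;
            default: return ToonErrorCategory.Unknown;
        }
    }
}
```

Returning `Unknown` for unrecognized or missing codes keeps the helper safe to call on `error.Code`, which is declared nullable in `ToonError`.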
Context-Aware Error Messages:
Every error includes:
- What went wrong - Clear description of the problem
- Why it's wrong - Explanation of the rule that was violated
- How to fix it - Actionable suggestions for correction
Example error messages:
// Unterminated string
[TOON1001] Unterminated double-quoted string at line 5, column 10.
String reached end of line without closing " character.
Fix: Add closing " before the end of the line
// Invalid escape sequence
[TOON1002] Invalid escape sequence '\x' at line 3, column 15.
Valid escape sequences: \n, \r, \t, \\, \", \'.
Fix: Use a valid escape sequence or remove the backslash
// Array size mismatch
[TOON3001] Array size mismatch: declared 5 elements, but found 3.
Missing 2 elements. Check if array is incomplete or elements are on wrong indentation level.
Fix: Either add 2 more elements or change the size declaration [5]→[3]
// Table size mismatch with helpful hint
[TOON3002] Table array size mismatch: declared 10 rows, but found 8.
Missing 2 rows. Check if rows are incomplete or have incorrect indentation.
Fix: Either add 2 more rows or update the size [10]→[8]

public enum TokenType
{
    // Values
    String, Number, True, False, Null, Identifier,
    // Structure
    Colon, Comma, Pipe,
    LeftBracket, RightBracket,
    LeftBrace, RightBrace,
    // Formatting
    Newline, Indent, Dedent, Whitespace,
    // Special
    Comment, EndOfFile, Invalid
}

All inherit from AstNode with position tracking:
// Document root
ToonDocument // Contains Properties[]
// Structural
PropertyNode // Key + Value
ObjectNode // Nested object with Properties[]
// Arrays
ArrayNode // Simple array with Elements[]
TableArrayNode // Tabular with Schema[] and Rows[][]
// Values
StringValueNode // String literal
NumberValueNode // Numeric (integer or float)
BooleanValueNode // true/false
NullValueNode // null

Every node includes:
int StartLine, StartColumn, StartPosition
int EndLine, EndColumn, EndPosition

The parser continues after errors, returning a partial AST and all error locations:
var source = @"
name: John
invalid line here
city: Boulder
";
var result = Toon.Parse(source);
// result.Document has 2 valid properties (name, city)
// result.Errors has 1 error (line 3)

Perfect for:
- IDE integration (IntelliSense on valid parts)
- Error highlighting (show all errors at once)
- Language servers
- Linters and validators
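The idea behind resilient parsing can be shown with a toy example. This is a minimal sketch that handles only flat `key: value` lines and says nothing about how the library's parser is actually implemented. `ResilientSketch` and `SketchResult` are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; not part of ToonTokenizer.
public sealed class SketchResult
{
    public List<KeyValuePair<string, string>> Properties { get; } =
        new List<KeyValuePair<string, string>>();
    public List<string> Errors { get; } = new List<string>();
}

public static class ResilientSketch
{
    public static SketchResult Parse(string source)
    {
        var result = new SketchResult();
        var lines = source.Replace("\r\n", "\n").Split('\n');

        for (int i = 0; i < lines.Length; i++)
        {
            var line = lines[i].Trim();
            if (line.Length == 0) continue; // skip blank lines

            int colon = line.IndexOf(':');
            if (colon <= 0)
            {
                // Record the problem with its 1-based line number, then keep
                // going instead of aborting the whole parse.
                result.Errors.Add($"Line {i + 1}: expected 'key: value'");
                continue;
            }

            result.Properties.Add(new KeyValuePair<string, string>(
                line.Substring(0, colon).Trim(),
                line.Substring(colon + 1).Trim()));
        }
        return result;
    }
}
```

Run on the `name` / `invalid line here` / `city` source shown earlier, this sketch collects two properties and one error on line 3, matching the behavior described for `Toon.Parse`. The key design choice is that an error is appended to a list and the loop continues, so one bad line never hides the valid content around it.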
using ToonTokenizer;
var tokens = Toon.Tokenize(source);
// Get tokens on specific line
var lineTokens = tokens.GetTokensOnLine(5);
// Find token at position
var token = tokens.GetTokenAt(line: 3, column: 10);
// Filter by type
var strings = tokens.GetTokensByType(TokenType.String);
// Syntax highlighting classification
// The loop variable is named t so it does not clash with 'token' declared above
foreach (var t in tokens)
{
    string cssClass = t.GetClassification();
    // Returns: "keyword", "string", "number", "comment", etc.
}
// Check categories
bool isKeyword = token.IsKeyword(); // true, false, null
bool isStructural = token.IsStructural(); // :, [, ], {, }, ,
bool isValue = token.IsValue(); // strings, numbers, booleans

using ToonTokenizer;
using ToonTokenizer.Ast;
var result = Toon.Parse(source);
// From token to AST node
var token = result.Tokens.GetTokenAt(line: 5, column: 3);
var node = token.GetAstNode(result.Document);
var property = token.GetPropertyNode(result.Document);
// From parse result directly
var nodeAtPosition = result.GetNodeAtPosition(42);
var nodeForToken = result.GetNodeForToken(myToken);
var propertyAt = result.GetPropertyAt(line: 3, column: 5);
// Get all properties (including nested)
var allProps = result.GetAllProperties();
// Find by path (dot notation)
var theme = result.FindPropertyByPath("user.settings.theme");
var email = result.FindPropertyByPath("user.email");
if (theme?.Value is StringValueNode str)
{
    Console.WriteLine($"Theme: {str.Value}");
}

Implement custom AST processing:
using System.Linq; // needed for Select
using ToonTokenizer.Ast;
public class MyVisitor : IAstVisitor<string>
{
    public string VisitDocument(ToonDocument node)
    {
        var results = node.Properties.Select(p => p.Accept(this));
        return string.Join(", ", results);
    }
    public string VisitProperty(PropertyNode node)
    {
        return $"{node.Key} = {node.Value.Accept(this)}";
    }
    public string VisitStringValue(StringValueNode node)
    {
        return $"\"{node.Value}\"";
    }
    // ... implement other Visit methods
}
// Use it
var doc = Toon.Parse(source).Document;
var output = doc.Accept(new MyVisitor());

public IEnumerable<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
{
    var source = span.GetText();
    var result = Toon.Parse(source); // Gets tokens + AST in one call
    foreach (var token in result.Tokens)
    {
        var classification = token.GetClassification();
        // Token offsets are relative to the parsed text, so shift them by the span start
        var tokenSpan = new SnapshotSpan(
            span.Snapshot,
            span.Start.Position + token.Position,
            token.Length
        );
        yield return new ClassificationSpan(
            tokenSpan,
            GetClassificationType(classification)
        );
    }
}

public IEnumerable<Completion> GetCompletions(int line, int column)
{
    var result = Toon.Parse(documentText);
    // Find the property we're currently in
    var property = result.GetPropertyAt(line, column);
    if (property != null)
    {
        // Context-aware suggestions based on property type
        if (property.Value is ObjectNode)
        {
            // Suggest nested property names
            yield return new Completion("theme");
            yield return new Completion("enabled");
        }
        else if (property.Value is ArrayNode)
        {
            // Suggest array-specific completions
            yield return new Completion("[size]");
        }
    }
    var token = result.Tokens.GetTokenAt(line, column);
    if (token?.Type == TokenType.Colon)
    {
        // After colon: suggest value types
        yield return new Completion("true");
        yield return new Completion("false");
        yield return new Completion("null");
    }
}

public IEnumerable<Diagnostic> GetDiagnostics()
{
    var result = Toon.Parse(documentText);
    foreach (var error in result.Errors)
    {
        yield return new Diagnostic
        {
            Severity = DiagnosticSeverity.Error,
            Message = error.Message,
            Range = new Range(
                error.Line - 1,
                error.Column - 1,
                error.EndPosition
            )
        };
    }
}

public IEnumerable<FoldingRange> GetFoldingRanges()
{
    var result = Toon.Parse(documentText);
    foreach (var property in result.Document.Properties)
    {
        if (property.Value is ObjectNode obj && obj.Properties.Count > 0)
        {
            yield return new FoldingRange
            {
                StartLine = obj.StartLine,
                EndLine = obj.EndLine,
                Kind = FoldingRangeKind.Region
            };
        }
        else if (property.Value is TableArrayNode table && table.Rows.Count > 5)
        {
            yield return new FoldingRange
            {
                StartLine = table.StartLine,
                EndLine = table.EndLine,
                Kind = FoldingRangeKind.Region
            };
        }
    }
}

// Convert verbose JSON to compact TOON for token savings
var jsonData = GetDataFromApi();
var toonEncoder = new ToonEncoder();
var compactPrompt = toonEncoder.Encode(jsonData);
// Use in prompt
var prompt = $@"
Analyze this data:
{compactPrompt}
What insights can you provide?
";

- Full TOON v3.0 specification support
- Handles all array types (inline, tabular, nested)
- Complete delimiter support (comma, tab, pipe)
- Resilient parsing with error recovery
- 637 unit tests covering edge cases (100% passing)
- Battle-tested on complex real-world data
- Handles malformed input gracefully
- Comprehensive error reporting with standardized error codes
- Context-aware error messages with actionable fix suggestions
- Efficient single-pass lexer
- Minimal allocations
- Streaming-friendly design
- .NET Standard 2.0 for maximum compatibility
- Rich IntelliSense support
- Extensive XML documentation
- Position tracking on everything
- Extension methods for common tasks
- Visitor pattern for AST traversal
- Hook points for custom behavior
- Clean separation of concerns
- Easy to integrate into larger systems
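Position tracking relies on a fixed relationship between a 0-based character offset and a 1-based line and column, the same conventions `ToonError` documents for `Position`, `Line`, and `Column`. The sketch below shows one way to derive the latter from the former. `PositionSketch.ToLineColumn` is a hypothetical helper written for this illustration, and it assumes `\n` line endings (a `\r` would simply count as a column).

```csharp
using System;

// Hypothetical helper for illustration; not part of ToonTokenizer.
public static class PositionSketch
{
    // Converts a 0-based character offset into a 1-based (line, column) pair.
    public static (int Line, int Column) ToLineColumn(string source, int position)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (position < 0 || position > source.Length)
            throw new ArgumentOutOfRangeException(nameof(position));

        int line = 1, column = 1;
        for (int i = 0; i < position; i++)
        {
            if (source[i] == '\n')
            {
                // A newline moves to the start of the next line.
                line++;
                column = 1;
            }
            else
            {
                column++;
            }
        }
        return (line, column);
    }
}
```

For the source `"name: John\nage: 30"`, offset 11 (the `a` of `age`) maps to line 2, column 1, and offset 16 (the `3` of `30`) maps to line 2, column 6.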
This library implements the TOON v3.0 specification. The full spec is included in spec.md.
Key features:
- Deterministic encoding
- Lossless round-tripping
- Strict and lenient parsing modes
- Position tracking for all tokens
- Table array detection
- Delimiter scoping rules
- Escape sequence handling
- Array size validation per §6.1 (detects undersized arrays)
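As an illustration of the array size check in the last item, the sketch below compares the declared length of an inline array line with the number of values that follow it. `ArraySizeSketch.Validate` is a hypothetical helper, not the library's validator. It assumes a comma delimiter and unquoted values, so a quoted value containing a comma would be miscounted.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of ToonTokenizer.
public static class ArraySizeSketch
{
    // Matches lines such as "scores[5]: 95,87,92,88,91".
    private static readonly Regex InlineArray =
        new Regex(@"^(?<key>\w+)\[(?<size>\d+)\]:\s*(?<items>.*)$");

    // Returns null when the declared size matches, otherwise a message.
    public static string Validate(string line)
    {
        var match = InlineArray.Match(line.Trim());
        if (!match.Success)
            throw new FormatException("Not an inline array line.");

        int declared = int.Parse(match.Groups["size"].Value, CultureInfo.InvariantCulture);
        string items = match.Groups["items"].Value;
        // An empty value list holds zero elements, not one empty element.
        int actual = items.Length == 0 ? 0 : items.Split(',').Length;

        return declared == actual
            ? null
            : $"Array size mismatch: declared {declared} elements, but found {actual}.";
    }
}
```

`Validate("scores[5]: 95,87,92,88,91")` returns `null`, while `Validate("scores[5]: 95,87,92")` returns a message in the style of the TOON3001 example. Having the length in the header is what makes truncated or padded data detectable without any external schema.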
| Platform | Support |
|---|---|
| .NET Core 2.0+ | Yes |
| .NET Framework 4.6.1+ | Yes |
| .NET 5, 6, 7, 8, 9, 10 | Yes |
| Mono | Yes |
| Xamarin | Yes |
| Unity | Yes (via .NET Standard 2.0) |
Typical parse performance on modern hardware:
| Document Size | Parse Time | Tokens/sec |
|---|---|---|
| 1 KB | < 1 ms | 500K |
| 10 KB | 2-5 ms | 400K |
| 100 KB | 20-40 ms | 350K |
| 1 MB | 200-300 ms | 300K |
Benchmarks vary based on document structure and hardware.
- Quick Start: See examples above
- Token Access: Examples/TokensInParseResult.md
- API Documentation: XML docs included in package
Check out the Examples directory for:
- Basic parsing examples
- Syntax highlighter implementation
- Error handling patterns
- AST visitor examples
- Token manipulation
Contributions welcome! Please:
- Follow existing code style
  - Use the .editorconfig settings
  - Keep methods focused and well-named
  - Add XML documentation for public APIs
- Write tests
  - Add tests for new features
  - Ensure all existing tests pass
  - Aim for high code coverage
- Update documentation
  - Update README for user-facing changes
  - Add examples for new features
  - Keep spec compliance notes current
Run the full test suite:
dotnet test

Test coverage:
- Lexer: Token generation, escape sequences, position tracking
- Parser: All node types, error recovery, edge cases
- Validation: Array size validation, string format validation, number format validation
- Extensions: Helper methods, visitor pattern
- Integration: Round-trip encoding/decoding
Apache License 2.0 - See LICENSE.txt file for details.
This library is independent from the TOON specification but implements it faithfully. The specification itself is MIT licensed.
- TOON Specification: https://github.com/toon-format/spec
- Reference Implementation (TypeScript): https://github.com/toon-format/toon
- This Library: https://github.com/madskristensen/ToonTokenizer
- NuGet Package: https://www.nuget.org/packages/ToonTokenizer/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Spec Questions: TOON Spec Repo
Mads Kristensen - GitHub | Twitter
Implementing the TOON specification by Johann Schopplich - @johannschopplich
Made with ❤️ for the .NET community