sergeyklay/factly


Factly


Factly is a modern CLI tool designed to evaluate the factuality of Large Language Models (LLMs) on the Massive Multitask Language Understanding (MMLU) benchmark. It provides a robust framework for prompt engineering experiments and factual accuracy assessment.

Features

  • Evaluate LLM factuality on the MMLU benchmark with detailed results
  • Support for various prompt engineering experiments via configurable system instructions
  • Generate comparative visualizations of factuality scores across models and prompts
  • Structured output for easy analysis and comparison
  • Built with modern Python tooling (Python 3.12, uv, click, pydantic)
  • Extensible and reproducible evaluation workflows

Note

Currently, only OpenAI models are supported.
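Since evaluations run against OpenAI models, you will need OpenAI credentials before invoking the tool. A minimal sketch, assuming Factly (like most OpenAI-based tools) reads the standard `OPENAI_API_KEY` environment variable; the key below is a placeholder:

```shell
# Make your OpenAI API key available to the CLI
# (placeholder value -- substitute your real key)
export OPENAI_API_KEY="sk-..."
```

Setting the variable in your shell profile keeps it available across sessions.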

Quick Start

```shell
# Run MMLU evaluation with default settings
factly mmlu

# Run MMLU evaluation and generate plots
factly mmlu --plot

# Get help on all available options
factly mmlu --help

# Get help on all available commands
factly --help
```

That's it! The tool uses optimized default parameters and saves all outputs to the output directory.

Note

For detailed installation instructions, see the Installation Guide; for usage instructions, use cases, examples, and advanced configuration options, see the Usage Guide.

Project Information

Factly is released under the MIT License, its documentation lives at Read the Docs, the code on GitHub, and the latest release on PyPI. It's rigorously tested on Python 3.12+.

If you'd like to contribute to Factly, you're most welcome!

Support

Should you have any questions or remarks, find a bug, or run into something you can't do with Factly, please open an issue.
