AgentClef

Audio is Evidence. Score is State.

An AI-agent-assisted transcription review workbench that turns audio into an editable draft score, then helps musicians reason about, correct, and confirm every uncertain passage.

English · 简体中文 · Docs · Architecture · v0.1

Why AgentClef?

AI transcription tools are getting better at producing a first draft. The hard part is turning that draft into a score a musician can trust.

AgentClef is designed around the review loop after the first transcription pass: listen, inspect, ask, compare, edit, confirm.

Problem	One-shot AI Transcription	AgentClef
Uncertain rhythm	Hidden inside the generated result	Exposed as local review targets
Wrong note duration	User manually guesses and edits	Agent reasons from audio evidence and beat context
AI edits	Often opaque or destructive	Proposed as CandidateEdits and applied only after confirmation
Score state	Export files or model output	Structured DraftScore as system truth
Accuracy goal	First-pass model accuracy	Final confirmed score accuracy
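
As a rough illustration of the "local review targets" row, the sketch below groups low-confidence notes by measure so that each group can be reviewed on its own. The `Note` fields, the `confidence` score, and the 0.6 threshold are assumptions made for this example, not AgentClef's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    # Hypothetical fields; the real NoteEvent schema may differ.
    note_id: str
    measure: int
    pitch: int          # MIDI pitch number
    confidence: float   # assumed scale: 0.0 (guess) to 1.0 (certain)


def review_targets(notes: list[Note], threshold: float = 0.6) -> dict[int, list[str]]:
    """Map each measure number to the IDs of notes below the confidence threshold."""
    targets: dict[int, list[str]] = defaultdict(list)
    for note in notes:
        if note.confidence < threshold:
            targets[note.measure].append(note.note_id)
    return dict(targets)


notes = [
    Note("n1", measure=1, pitch=60, confidence=0.95),
    Note("n2", measure=1, pitch=64, confidence=0.40),
    Note("n3", measure=2, pitch=67, confidence=0.55),
]
targets = review_targets(notes)
```

Here `targets` contains one entry per measure that still needs attention, which is the kind of unit a reviewer could step through instead of re-checking the whole draft.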

Core Philosophy

Evidence before notation — every musical event should be traceable back to audio time and beat position.
Draft as structured state — the internal score is not a PDF, image, or plain MIDI dump; it is an editable DraftScore.
Agent as reviewer, not owner — the Agent explains and proposes edits, while the system owns score state and the user confirms changes.
Local reasoning over global guessing — users select a note, measure, chord, or time range; the Agent reasons over that local musical context.
Accuracy through review — AgentClef measures success by how quickly users reach a correct final score, not only by the first generated draft.
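
To make the first, second, and fourth principles more concrete, here is a minimal sketch of a draft score whose notes carry both audio time and beat position, plus a helper that selects the notes overlapping a time range. All class names, field names, and the constant-tempo conversion are illustrative assumptions; the real DraftScore and BeatGrid are defined in the project docs and may look quite different.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NoteEvent:
    # Hypothetical schema: each note keeps its audio evidence next to its notation.
    note_id: str
    pitch: int             # MIDI pitch number
    onset_sec: float       # where the note starts in the audio
    offset_sec: float      # where the note ends in the audio
    onset_beat: float      # where the note starts on the beat grid
    duration_beats: float  # notated length in beats


@dataclass
class DraftScore:
    tempo_bpm: float
    notes: list[NoteEvent] = field(default_factory=list)

    def notes_in_range(self, start_sec: float, end_sec: float) -> list[NoteEvent]:
        """Return notes whose audio span overlaps [start_sec, end_sec)."""
        return [n for n in self.notes
                if n.onset_sec < end_sec and n.offset_sec > start_sec]


def seconds_to_beats(seconds: float, tempo_bpm: float) -> float:
    """Convert audio time to a beat position, assuming constant tempo from time 0."""
    return seconds * tempo_bpm / 60.0


score = DraftScore(
    tempo_bpm=120.0,
    notes=[
        NoteEvent("n1", 60, 0.0, 0.5, 0.0, 1.0),
        NoteEvent("n2", 64, 0.5, 1.5, 1.0, 2.0),
        NoteEvent("n3", 67, 2.0, 2.5, 4.0, 1.0),
    ],
)
# The user selects 0.4 s to 1.0 s; only overlapping notes become Agent context.
local_context = score.notes_in_range(0.4, 1.0)
```

A constant tempo is used only to keep the example short; a real beat grid would typically need to handle tempo changes.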

Architecture

This is a high-level architecture snapshot. The full system design lives in docs/technical-architecture.md.

┌─────────────────────────────────────────────────────────────────┐
│                        Musician Workflow                        │
│   upload audio → draft score → local review → confirmed score   │
└────────────────────────────┬────────────────────────────────────┘
                             │
              ┌──────────────▼──────────────┐
              │     React + Vite Workbench   │
              │ waveform · note timeline ·   │
              │ Agent panel · edit preview   │
              └──────────────┬──────────────┘
                             │
              ┌──────────────▼──────────────┐
              │       FastAPI Backend        │
              │ project · task · draft ·     │
              │ Agent context · edit engine  │
              └──────────────┬──────────────┘
                             │
              ┌──────────────▼──────────────┐
              │     PostgreSQL + Redis       │
              │ DraftScore · revisions ·     │
              │ job queue · task state       │
              └──────────────┬──────────────┘
                             │
              ┌──────────────▼──────────────┐
              │       Celery Worker          │
              │ FFmpeg → librosa → Basic     │
              │ Pitch → postprocess          │
              └──────────────┬──────────────┘
                             │
              ┌──────────────▼──────────────┐
              │      LLM Provider Adapter    │
              │ local context → structured   │
              │ CandidateEdit proposals      │
              └─────────────────────────────┘
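
The bottom box of the diagram, which maps local context to structured CandidateEdit proposals, could be expressed as a small provider interface. The sketch below is one possible shape, not the project's actual adapter: `LLMProvider`, `RoundingProvider`, `ask_agent`, and the dictionary-based context are all hypothetical names, and the stand-in provider applies a toy rule instead of calling a model.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CandidateEdit:
    # Hypothetical proposal shape: which note, which field, the new value, and why.
    note_id: str
    field_name: str
    new_value: float
    rationale: str


class LLMProvider(Protocol):
    """Any provider that can turn a local context into structured proposals."""

    def propose_edits(self, question: str, context: dict) -> list[CandidateEdit]: ...


class RoundingProvider:
    """Stand-in provider for tests; a real adapter would call a model API here."""

    def propose_edits(self, question: str, context: dict) -> list[CandidateEdit]:
        # Toy rule: round each duration to the nearest whole beat (minimum 1).
        edits = []
        for note in context["notes"]:
            rounded = float(max(1, round(note["duration_beats"])))
            if rounded != note["duration_beats"]:
                edits.append(CandidateEdit(
                    note["note_id"], "duration_beats", rounded,
                    f"Rounded to {rounded} beats in reply to: {question}",
                ))
        return edits


def ask_agent(provider: LLMProvider, question: str, context: dict) -> list[CandidateEdit]:
    # The caller depends only on the protocol, so providers can be swapped.
    return provider.propose_edits(question, context)


context = {"notes": [{"note_id": "n7", "duration_beats": 1.8},
                     {"note_id": "n8", "duration_beats": 1.0}]}
proposals = ask_agent(RoundingProvider(), "How long should this high note last?", context)
```

Returning typed proposals rather than free text is what lets the backend validate an edit before anything touches score state.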

Tech Stack

Full stack responsibilities are documented in docs/technology-stack.md.

Area	Stack
Workbench	React + TypeScript + Vite
Frontend State	TanStack Query + Zustand
Backend	Python 3.12+ + FastAPI + Pydantic
Persistence	PostgreSQL + SQLAlchemy + Alembic
Jobs	Redis + Celery
Audio / Agent	FFmpeg + librosa + Basic Pitch + LLM provider adapter

Workflow Preview

1. Upload a local audio file
   → AgentClef creates a Project, AudioAsset, and TranscriptionJob

2. Generate a structured draft
   → the worker builds BeatGrid, NoteEvents, optional ChordEvents, and uncertainty markers

3. Review inside the workbench
   → waveform and editable note timeline stay aligned around the same DraftScore

4. Ask the Agent about a local passage
   → "How long should this high note last?"

5. Confirm a CandidateEdit
   → the Edit Engine validates the proposal, updates DraftScore, and writes a Revision
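
Step 5 can be sketched as a small in-memory engine that validates a proposal, applies it, and records a revision. This is an illustration under stated assumptions: the `EditEngine` class, its `confirm` method, the two validation rules, and the `Revision` fields are invented for the example, and persistence (PostgreSQL in the target stack) is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    note_id: str
    duration_beats: float


@dataclass(frozen=True)
class CandidateEdit:
    note_id: str
    new_duration_beats: float


@dataclass(frozen=True)
class Revision:
    number: int
    note_id: str
    before: float
    after: float


class EditEngine:
    """Owns the draft; proposals change nothing until confirm() is called."""

    def __init__(self, notes: list[Note]) -> None:
        self.notes = {n.note_id: n for n in notes}
        self.revisions: list[Revision] = []

    def validate(self, edit: CandidateEdit) -> None:
        # Reject proposals that point at a missing note or a non-positive length.
        if edit.note_id not in self.notes:
            raise ValueError(f"unknown note: {edit.note_id}")
        if edit.new_duration_beats <= 0:
            raise ValueError("duration must be positive")

    def confirm(self, edit: CandidateEdit) -> Revision:
        self.validate(edit)
        old = self.notes[edit.note_id]
        self.notes[edit.note_id] = replace(old, duration_beats=edit.new_duration_beats)
        revision = Revision(len(self.revisions) + 1, old.note_id,
                            old.duration_beats, edit.new_duration_beats)
        self.revisions.append(revision)
        return revision


engine = EditEngine([Note("n7", 1.5)])
revision = engine.confirm(CandidateEdit("n7", 2.0))
```

Because each revision stores both the old and the new value, a confirmed change stays inspectable and could be reverted later.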

Project Structure

Target v0.1 structure:

AgentClef/
├── docs/        # product, architecture, model, and milestone documentation
├── server/      # FastAPI backend
├── worker/      # Celery tasks and audio pipeline
├── web/         # React + Vite workbench
├── shared/      # shared schema contracts or generated types
└── tests/       # backend, pipeline, contract, and E2E tests

AGENTS.md is a local collaboration instruction file and is not part of the public project documentation.

Roadmap

Milestone	Status	Focus
v0.1	Planning	Local audio upload, async draft generation, timeline review, Agent CandidateEdit confirmation
v0.2	Planned	Audio-score synchronization, loop playback, uncertainty navigation, candidate comparison
v0.3	Planned	Accuracy fixtures, model adapter evaluation, beat and quantization improvements
v0.4	Planned	Project lifecycle, revision browsing, MIDI and MusicXML export baseline
v0.5	Planned	Chord timeline, transposition, instrument modes, stem-assisted review research
v1.0	Planned	Stable AI Agent transcription review workbench

Development

AgentClef is currently in the v0.1 planning-to-implementation stage. The commands below describe the target local development flow after the foundation issue is implemented.

# Backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
uvicorn server.main:app --reload

# Worker
celery -A worker.app:celery_app worker --loglevel=info

# Frontend
cd web
npm install
npm run dev

Quality gates:

# Backend
pytest
ruff check .

# Frontend (run from web/)
npm run test
npm run build

# E2E
npx playwright test

Documentation

Development Process

AgentClef uses issue-scoped development. Local collaboration instructions are kept in AGENTS.md, which is intentionally excluded from public documentation.

Confirm the issue objective, scope, implementation points, tests, and acceptance criteria before coding.
Implement only within the confirmed issue boundary.
Run local quality gates before handing off.
Use Conventional Commits.
The developer performs git commit and git push.

Current Development Version

AgentClef is currently in v0.1: the minimum transcription review loop.

AgentClef — make every uncertain note reviewable.
