
DXL-0702/AgentClef

 
 


AgentClef

Audio is Evidence. Score is State.

An AI Agent-assisted transcription review workbench that turns audio into an editable draft score, then helps musicians reason about, correct, and confirm every uncertain passage.


English · 简体中文 · Docs · Architecture · v0.1


Why AgentClef?

AI transcription tools are getting better at producing a first draft. The hard part is turning that draft into a score a musician can trust.

AgentClef is designed around the review loop after the first transcription pass: listen, inspect, ask, compare, edit, confirm.

Problem             | One-shot AI Transcription          | AgentClef
--------------------|------------------------------------|---------------------------------------------------------------
Uncertain rhythm    | Hidden inside the generated result | Exposed as local review targets
Wrong note duration | User manually guesses and edits    | Agent reasons from audio evidence and beat context
AI edits            | Often opaque or destructive        | Proposed as CandidateEdits and applied only after confirmation
Score state         | Export files or model output       | Structured DraftScore as system truth
Accuracy goal       | First-pass model accuracy          | Final confirmed score accuracy
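
As a rough illustration of "exposed as local review targets", the sketch below filters a draft for low-confidence notes. It is not taken from the AgentClef codebase: the `NoteEvent` fields, the `confidence` score, and the `0.6` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    """Hypothetical note record; field names are assumptions, not the project's schema."""
    pitch: int         # MIDI pitch number
    start_sec: float   # onset in audio time
    end_sec: float     # offset in audio time
    confidence: float  # 0.0 to 1.0, assumed to come from the transcription model


def review_targets(notes: list[NoteEvent], threshold: float = 0.6) -> list[NoteEvent]:
    """Return notes whose confidence is below the threshold, ordered by onset."""
    uncertain = [n for n in notes if n.confidence < threshold]
    return sorted(uncertain, key=lambda n: n.start_sec)


notes = [
    NoteEvent(pitch=67, start_sec=1.00, end_sec=1.95, confidence=0.58),
    NoteEvent(pitch=60, start_sec=0.00, end_sec=0.48, confidence=0.93),
    NoteEvent(pitch=64, start_sec=0.50, end_sec=0.71, confidence=0.41),
]
targets = review_targets(notes)
for n in targets:
    print(f"review pitch {n.pitch} at {n.start_sec:.2f}s (confidence {n.confidence:.2f})")
```

In a workbench, a list like this could drive "jump to next uncertain note" navigation instead of leaving the user to find problems by ear.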

Core Philosophy

  • Evidence before notation — every musical event should be traceable back to audio time and beat position.
  • Draft as structured state — the internal score is not a PDF, image, or plain MIDI dump; it is an editable DraftScore.
  • Agent as reviewer, not owner — the Agent explains and proposes edits, while the system owns score state and the user confirms changes.
  • Local reasoning over global guessing — users select a note, measure, chord, or time range; the Agent reasons over that local musical context.
  • Accuracy through review — AgentClef measures success by how quickly users reach a correct final score, not only by the first generated draft.
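
To make "traceable back to audio time and beat position" and "local musical context" concrete, here is a minimal sketch. It assumes a constant-tempo grid, which a real BeatGrid would probably not be limited to, and every class and function name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeatGrid:
    """Assumed constant-tempo grid; tempo changes are ignored in this sketch."""
    bpm: float
    first_beat_sec: float  # audio time at which beat 0 falls

    def beat_at(self, time_sec: float) -> float:
        """Convert an audio timestamp to a (fractional) beat position."""
        return (time_sec - self.first_beat_sec) * self.bpm / 60.0


@dataclass(frozen=True)
class NoteEvent:
    pitch: int
    start_sec: float
    end_sec: float


def local_context(notes: list[NoteEvent], start_sec: float, end_sec: float) -> list[NoteEvent]:
    """Return the notes that overlap the selected range [start_sec, end_sec)."""
    return [n for n in notes if n.start_sec < end_sec and n.end_sec > start_sec]


grid = BeatGrid(bpm=120.0, first_beat_sec=0.5)
notes = [
    NoteEvent(60, 0.5, 1.0),
    NoteEvent(62, 1.0, 1.5),
    NoteEvent(64, 1.5, 2.5),
    NoteEvent(65, 2.5, 3.0),
]
selection = local_context(notes, start_sec=1.0, end_sec=2.0)
onsets = [(n.pitch, grid.beat_at(n.start_sec)) for n in selection]
```

Each selected note keeps both coordinates: its onset in seconds (the audio evidence) and the beat position derived from it (the notation).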

Architecture

This is a high-level architecture snapshot. The full system design lives in docs/technical-architecture.md.

┌─────────────────────────────────────────────────────────────────┐
│                        Musician Workflow                        │
│   upload audio → draft score → local review → confirmed score   │
└────────────────────────────┬────────────────────────────────────┘
                             │
              ┌──────────────▼───────────────┐
              │     React + Vite Workbench   │
              │ waveform · note timeline ·   │
              │ Agent panel · edit preview   │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │       FastAPI Backend        │
              │ project · task · draft ·     │
              │ Agent context · edit engine  │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │     PostgreSQL + Redis       │
              │ DraftScore · revisions ·     │
              │ job queue · task state       │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │       Celery Worker          │
              │ FFmpeg → librosa → Basic     │
              │ Pitch → postprocess          │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │      LLM Provider Adapter    │
              │ local context → structured   │
              │ CandidateEdit proposals      │
              └──────────────────────────────┘
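
The last box, "local context → structured CandidateEdit proposals", might look roughly like the sketch below. The `LLMProvider` protocol, the `StubProvider`, the JSON reply shape, and the `CandidateEdit` fields are assumptions for illustration only; the actual adapter design is described in docs/technical-architecture.md.

```python
import json
from dataclasses import dataclass
from typing import Protocol

# Assumed whitelist of note attributes a proposal may touch.
EDITABLE_FIELDS = {"start_beat", "duration_beats"}


@dataclass(frozen=True)
class CandidateEdit:
    """Hypothetical proposal shape: one attribute change on one note."""
    note_id: str
    field: str
    new_value: float
    rationale: str


class LLMProvider(Protocol):
    """Anything that turns a prompt into a text reply can act as a provider."""
    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Stand-in for a real provider; returns a canned JSON reply."""
    def complete(self, prompt: str) -> str:
        return json.dumps([
            {"note_id": "n2", "field": "duration_beats", "new_value": 2.0,
             "rationale": "Energy sustains through beat 3."},
            {"note_id": "n2", "field": "title", "new_value": 0, "rationale": "off-schema"},
        ])


def propose_edits(provider: LLMProvider, context: dict) -> list[CandidateEdit]:
    """Send local context to the provider and keep only well-formed proposals."""
    prompt = "Propose edits as a JSON list for this passage:\n" + json.dumps(context)
    proposals = []
    for item in json.loads(provider.complete(prompt)):
        if item.get("field") not in EDITABLE_FIELDS:
            continue  # drop anything outside the structured contract
        proposals.append(CandidateEdit(
            note_id=str(item["note_id"]),
            field=item["field"],
            new_value=float(item["new_value"]),
            rationale=str(item.get("rationale", "")),
        ))
    return proposals


edits = propose_edits(StubProvider(), {"measure": 4, "note_ids": ["n1", "n2"]})
```

Parsing the reply into a typed structure at the adapter boundary means the rest of the system never handles free-form model text, which fits the "Agent as reviewer, not owner" principle.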

Tech Stack

Full stack responsibilities are documented in docs/technology-stack.md.

Area           | Stack
---------------|-------------------------------------------------------
Workbench      | React + TypeScript + Vite
Frontend State | TanStack Query + Zustand
Backend        | Python 3.12+ + FastAPI + Pydantic
Persistence    | PostgreSQL + SQLAlchemy + Alembic
Jobs           | Redis + Celery
Audio / Agent  | FFmpeg + librosa + Basic Pitch + LLM provider adapter

Workflow Preview

1. Upload a local audio file
   → AgentClef creates a Project, AudioAsset, and TranscriptionJob

2. Generate a structured draft
   → the worker builds BeatGrid, NoteEvents, optional ChordEvents, and uncertainty markers

3. Review inside the workbench
   → waveform and editable note timeline stay aligned around the same DraftScore

4. Ask the Agent about a local passage
   → "How long should this high note last?"

5. Confirm a CandidateEdit
   → the Edit Engine validates the proposal, updates DraftScore, and writes a Revision
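
Step 5 can be sketched as a small function that validates a proposal, updates the draft, and records a revision. This is an assumption-laden illustration rather than the project's Edit Engine: the `Note`, `CandidateEdit`, `Revision`, and `DraftScore` shapes and the validation rules are all hypothetical.

```python
from dataclasses import dataclass, replace

# Assumed whitelist of note attributes an edit may change.
EDITABLE_FIELDS = {"start_beat", "duration_beats"}


@dataclass(frozen=True)
class Note:
    note_id: str
    pitch: int
    start_beat: float
    duration_beats: float


@dataclass(frozen=True)
class CandidateEdit:
    note_id: str
    field: str
    new_value: float


@dataclass(frozen=True)
class Revision:
    edit: CandidateEdit
    old_value: float  # kept so the change can be inspected or undone later


@dataclass
class DraftScore:
    notes: dict[str, Note]
    revisions: list[Revision]


def confirm_edit(draft: DraftScore, edit: CandidateEdit) -> Revision:
    """Validate a user-confirmed edit, apply it, and append a Revision."""
    note = draft.notes.get(edit.note_id)
    if note is None:
        raise ValueError(f"unknown note: {edit.note_id}")
    if edit.field not in EDITABLE_FIELDS:
        raise ValueError(f"field not editable: {edit.field}")
    if edit.field == "duration_beats" and edit.new_value <= 0:
        raise ValueError("duration must be positive")
    if edit.field == "start_beat" and edit.new_value < 0:
        raise ValueError("start beat must not be negative")
    revision = Revision(edit=edit, old_value=getattr(note, edit.field))
    draft.notes[edit.note_id] = replace(note, **{edit.field: edit.new_value})
    draft.revisions.append(revision)
    return revision


draft = DraftScore(notes={"n1": Note("n1", 72, 8.0, 1.0)}, revisions=[])
edit = CandidateEdit(note_id="n1", field="duration_beats", new_value=2.0)
confirm_edit(draft, edit)
```

All validation runs before any mutation, so a rejected proposal leaves both the notes and the revision list untouched.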

Project Structure

Target v0.1 structure:

AgentClef/
├── docs/        # product, architecture, model, and milestone documentation
├── server/      # FastAPI backend
├── worker/      # Celery tasks and audio pipeline
├── web/         # React + Vite workbench
├── shared/      # shared schema contracts or generated types
└── tests/       # backend, pipeline, contract, and E2E tests

AGENTS.md is a local collaboration instruction file and is not part of the public project documentation.

Roadmap

Milestone | Status   | Focus
----------|----------|--------------------------------------------------------------------------------------------
v0.1      | Planning | Local audio upload, async draft generation, timeline review, Agent CandidateEdit confirmation
v0.2      | Planned  | Audio-score synchronization, loop playback, uncertainty navigation, candidate comparison
v0.3      | Planned  | Accuracy fixtures, model adapter evaluation, beat and quantization improvements
v0.4      | Planned  | Project lifecycle, revision browsing, MIDI and MusicXML export baseline
v0.5      | Planned  | Chord timeline, transposition, instrument modes, stem-assisted review research
v1.0      | Planned  | Stable AI Agent transcription review workbench

Development

AgentClef is currently in the v0.1 planning-to-implementation stage. The commands below describe the target local development flow after the foundation issue is implemented.

# Backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
uvicorn server.main:app --reload

# Worker
celery -A worker.app:celery_app worker --loglevel=info

# Frontend
cd web
npm install
npm run dev

Quality gates:

# Backend
pytest
ruff check .

# Frontend (run from web/)
npm run test
npm run build

# E2E
npx playwright test

Documentation

Development Process

AgentClef uses issue-scoped development. Local collaboration instructions are kept in AGENTS.md, which is intentionally excluded from public documentation.

  1. Confirm the issue objective, scope, implementation points, tests, and acceptance criteria before coding.
  2. Implement only within the confirmed issue boundary.
  3. Run local quality gates before handing off.
  4. Use Conventional Commits.
  5. The developer performs git commit and git push.

Current Development Version

AgentClef is currently in v0.1: the minimum transcription review loop.


AgentClef — make every uncertain note reviewable.

About

An AI agent that autonomously listens, plans, and transcribes any audio into precise, playable sheet music—your intelligent key to every score.
