Project Heiji

Heiji is a Raycast extension that turns any macOS text field into an instant voice-driven AI assistant. The MVP ships three commands:

Dictate – records speech, transcribes via OpenAI STT, and inserts text (or routes wake phrases to an LLM).
Settings – manages OpenAI credentials, language, silence detection, and dictionary override.
Retranscribe Last – retries the most recent recording without re-capturing audio.

Install & Run Locally (for testers)

Requirements

Raycast for macOS with the Developer CLI enabled.
Node.js 20.15.1 or newer (verified up to Node 24) and npm 10+.
SoX for audio capture (brew install sox).
An OpenAI API key with access to gpt-5-transcribe and gpt-4o-mini.
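
The Node.js minimum above can be checked programmatically before a tester starts a session. The following is a minimal sketch, not part of the Heiji codebase: `parseVersion` and `meetsMinimum` are hypothetical helper names, and only the `20.15.1` floor comes from this README.

```typescript
// Hypothetical preflight helper; not part of the Heiji codebase.
const MIN_NODE = "20.15.1"; // minimum stated in the requirements above

/** Parse "v20.15.1" or "20.15.1" into [major, minor, patch]. */
function parseVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version.trim());
  if (!match) {
    throw new Error(`Unrecognised version string: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

/** True when `actual` is greater than or equal to `minimum`. */
function meetsMinimum(actual: string, minimum: string = MIN_NODE): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    // The first component that differs decides the comparison.
    if (a[i] !== m[i]) return a[i] > m[i];
  }
  return true; // versions are identical
}

console.log(meetsMinimum("v20.15.1"));
console.log(meetsMinimum("v18.19.0"));
```

In a real preflight script the argument would typically be `process.version`, which Node.js reports in the `vMAJOR.MINOR.PATCH` form this parser accepts.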

Step-by-step

Clone the project

```
git clone https://github.com/project-heiji/heiji.git
cd heiji
```

Install dependencies
```
npm install
```
Verify the build (optional but recommended)
```
npm run lint
npm run test
npm run build
```
Start the extension in Raycast
```
npx ray dev
```
The Raycast CLI will start a development session and open Raycast with the three Heiji commands available while the process is running. Keep this terminal window open for hot reloading. If the CLI is not found, ensure Raycast’s Developer CLI setting is enabled and the ray binary is on your PATH (running which ray should print its location).
Assign a hotkey (optional)
- In Raycast Preferences → Extensions → Heiji, select the Dictate command.
- Click Add Shortcut and bind it to a convenient key (e.g. ⌥⇧D).

First-time Configuration

Open Raycast and run Heiji: Settings.
Enter your OpenAI API key. Raycast will store it securely.
Adjust preferences if needed:
- Dictation Language – ISO 639-1 code (e.g. en, fr).
- Silence Stop Seconds – auto-stop delay after you stop speaking.
- Dictionary Path Override – optional JSON file that biases transcription.
Close the form; preferences save automatically and you’ll see a confirmation toast.

You’re ready to dictate. Launch the Dictate command and start speaking—Heiji handles transcription, wake commands, and clipboard restore automatically.

Quick Technical Checks

sox --version should report 14.4 or later; if missing, install via brew install sox.
ray --version should output the bundled Raycast CLI version.
node -v should be ≥ 20.15.1 (tested through Node 24.x).
If SoX is installed outside your default PATH, export HEIJI_SOX_PATH=/full/path/to/sox (e.g., /opt/homebrew/bin/sox) before launching Raycast or the CLI.
If ambient noise requires tweaking silence detection, set HEIJI_SILENCE_THRESHOLD (e.g., 0.3%) to adjust when recording stops.
The transcription prompt explicitly mentions wake phrases (“Heiji”, “Hey G”, “Heyji”) so the STT model keeps them verbatim; avoid removing this guidance when modifying the prompt builder.
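
To illustrate how the two environment variables above could feed into audio capture, here is a minimal sketch that assembles a SoX command line using SoX's `silence` effect. How Heiji actually invokes SoX is not shown in this README, so `buildSoxCommand`, the `RecorderEnv` shape, the `1%` fallback threshold, and the 0.1 s start duration are all assumptions made for illustration.

```typescript
// Hypothetical helper: Heiji's real recorder module is not shown in this README.
interface RecorderEnv {
  HEIJI_SOX_PATH?: string;
  HEIJI_SILENCE_THRESHOLD?: string;
}

interface SoxCommand {
  binary: string;
  args: string[];
}

function buildSoxCommand(
  env: RecorderEnv,
  outputPath: string,
  silenceSeconds: number
): SoxCommand {
  // Fall back to whatever "sox" resolves to on PATH when no override is set.
  const binary = env.HEIJI_SOX_PATH?.trim() || "sox";
  // "1%" is an assumed fallback, not a documented Heiji default.
  const threshold = env.HEIJI_SILENCE_THRESHOLD?.trim() || "1%";
  return {
    binary,
    args: [
      "-d", // read from the default audio input device
      outputPath,
      "silence",
      // Start recording once audio exceeds the threshold for 0.1 s.
      "1", "0.1", threshold,
      // Stop after `silenceSeconds` of audio below the threshold.
      "1", silenceSeconds.toFixed(1), threshold,
    ],
  };
}

const example = buildSoxCommand(
  { HEIJI_SOX_PATH: "/opt/homebrew/bin/sox", HEIJI_SILENCE_THRESHOLD: "0.3%" },
  "/tmp/heiji.wav",
  2
);
console.log(example.binary, example.args.join(" "));
```

Returning the binary and argument list separately keeps the sketch compatible with `child_process.spawn`, which avoids shell quoting problems for paths that contain spaces.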

Everyday Usage Guide

Basic dictation
1. Trigger the Heiji: Dictate command.
2. Speak your text; Heiji stops recording after the configured silence window.
3. The transcript is pasted into the active application and your original clipboard is restored.
Wake phrase actions
- Start with “Heiji …” (e.g. “Heiji, summarize this into three bullet points”).
- The remainder is sent to gpt-4o-mini; the LLM response replaces the current selection.
Retrying a recording
- Run Heiji: Retranscribe Last to re-transcribe the most recent audio without speaking again.
Troubleshooting
- Toasts surface recoverable issues such as missing transcription or wake response.
- If you hit an error toast, re-run the Dictate command and try again.

Commands overview

| Command | Purpose |
| --- | --- |
| `Heiji: Dictate` | Records audio and inserts the transcript or wake response into the front app |
| `Heiji: Settings` | Configures API key, language, silence timeout, and dictionary path |
| `Heiji: Retranscribe Last` | Reuses the cached audio file and re-runs transcription |

Developer Workflow

If you are contributing to the extension, the repository defines convenience scripts:

npm run lint – runs ESLint (skips until dependencies are installed).
npm run test – executes the ts-node test suites (dictionary, audio, OpenAI client, insertion, wake, dictate, retranscribe, logger).
npm run build – type-checks the extension with tsc --noEmit.

Note: Commands are safe to run repeatedly; when dependencies are missing the script reports the skip explicitly.

Commands

Dictate

Trigger via Raycast (e.g., keyboard shortcut).
HUD flow: Listening… → Processing… → Inserted or Heiji Action complete for wake requests.
Wake phrases (Heiji, Hey G, Heyji) are stripped before being sent to gpt-4o-mini.
Clipboard content is restored automatically after insertion; Cmd+Z reverts the paste.
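
The wake phrase handling described above (detect a leading wake phrase, strip it, forward the remainder to the LLM) can be sketched as follows. This is an illustrative approximation rather than the code in `src`: `routeTranscript`, the `Routed` type, and the word-boundary rule are assumptions, while the three phrases are the ones listed in this README.

```typescript
// Hypothetical sketch; the real wake module in src/ may differ.
const WAKE_PHRASES = ["Heiji", "Hey G", "Heyji"]; // phrases listed in this README

type Routed =
  | { mode: "wake"; prompt: string }
  | { mode: "dictate"; text: string };

function routeTranscript(transcript: string): Routed {
  const text = transcript.trim();
  for (const phrase of WAKE_PHRASES) {
    // Case-insensitive comparison of the leading characters only.
    const head = text.slice(0, phrase.length).toLowerCase();
    if (head !== phrase.toLowerCase()) continue;

    // Require a word boundary so "Hey Google" does not match "Hey G".
    const next = text.charAt(phrase.length);
    if (next !== "" && /[A-Za-z0-9]/.test(next)) continue;

    // Strip the wake phrase plus any separating punctuation or spaces.
    const prompt = text.slice(phrase.length).replace(/^[\s,.:;!?-]+/, "");
    return { mode: "wake", prompt };
  }
  return { mode: "dictate", text };
}

console.log(routeTranscript("Heiji, summarize this into three bullet points"));
console.log(routeTranscript("Hello world"));
```

A discriminated union keeps the two outcomes explicit, so a caller has to branch on `mode` before it can read `prompt` or `text`.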

Settings

Exposes the four preferences required by the MVP:

| Preference | Notes |
| --- | --- |
| `OpenAI API Key` | Stored securely by Raycast. |
| `Dictation Language` | ISO 639-1 code controlling STT language (default `fr`). |
| `Silence Stop Seconds` | Controls voice-activity timeout (default `2.0`). |
| `Dictionary Path` | Leave blank to use the auto-generated path in `~/Library/Application Support`. |
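
As a rough illustration of how these four preferences could be validated and given their documented defaults, consider the sketch below. The field names (`apiKey`, `language`, `silenceStopSeconds`, `dictionaryPath`) and the `normalizePreferences` function are hypothetical; only the defaults `fr` and `2.0` and the ISO 639-1 constraint come from the table.

```typescript
// Hypothetical shapes; these field names are illustrative, not Heiji's actual preference keys.
interface RawPreferences {
  apiKey?: string;
  language?: string;
  silenceStopSeconds?: string; // assumed to arrive as text from a form field
  dictionaryPath?: string;
}

interface HeijiPreferences {
  apiKey: string;
  language: string;
  silenceStopSeconds: number;
  dictionaryPath: string | null; // null means "use the auto-generated path"
}

function normalizePreferences(raw: RawPreferences): HeijiPreferences {
  const apiKey = (raw.apiKey ?? "").trim();
  if (apiKey === "") {
    throw new Error("OpenAI API Key is required");
  }

  // Documented default language is "fr".
  const language = (raw.language ?? "").trim().toLowerCase() || "fr";
  if (!/^[a-z]{2}$/.test(language)) {
    throw new Error(`Expected an ISO 639-1 code, got "${language}"`);
  }

  // Documented default silence window is 2.0 seconds.
  const rawSeconds = (raw.silenceStopSeconds ?? "").trim();
  const silenceStopSeconds = rawSeconds === "" ? 2.0 : Number(rawSeconds);
  if (!isFinite(silenceStopSeconds) || silenceStopSeconds <= 0) {
    throw new Error("Silence Stop Seconds must be a positive number");
  }

  const dictionaryPath = (raw.dictionaryPath ?? "").trim() || null;
  return { apiKey, language, silenceStopSeconds, dictionaryPath };
}

console.log(normalizePreferences({ apiKey: "sk-example" }));
```

The two-letter pattern only checks the shape of the language code; it does not confirm that the code is actually assigned in ISO 639-1.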

Retranscribe Last

Surfaces a Retranscribing… HUD while reusing the cached audio file from the last dictation.
On success inserts the new transcript and shows Retranscribed.
On failure displays a toast and leaves the clipboard untouched.
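
The success and failure paths above can be sketched with injected dependencies, which lets the flow run without Raycast or the OpenAI client. Everything in this example is an assumption made for illustration: `RetranscribeDeps`, its member names, and the failure messages do not come from the Heiji source. Only the two HUD strings are taken from this section.

```typescript
// Hypothetical dependencies, injected so the flow can run without Raycast or OpenAI.
interface RetranscribeDeps {
  lastRecordingPath: () => string | null; // cached audio from the last dictation
  transcribe: (audioPath: string) => Promise<string>;
  insert: (text: string) => Promise<void>; // paste into the front app
  showHud: (message: string) => Promise<void>;
  showFailure: (message: string) => Promise<void>;
}

async function retranscribeLast(deps: RetranscribeDeps): Promise<boolean> {
  const audioPath = deps.lastRecordingPath();
  if (audioPath === null) {
    await deps.showFailure("No cached recording to retranscribe");
    return false;
  }

  await deps.showHud("Retranscribing…");
  try {
    const transcript = (await deps.transcribe(audioPath)).trim();
    if (transcript === "") {
      throw new Error("Empty transcript");
    }
    // Insertion is the only step that touches the clipboard, and it runs
    // only after transcription has succeeded.
    await deps.insert(transcript);
    await deps.showHud("Retranscribed");
    return true;
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    await deps.showFailure(`Retranscription failed: ${reason}`);
    return false;
  }
}
```

Because `insert` is only called after a non-empty transcript is available, a failed transcription never reaches the clipboard, which matches the behaviour described above.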

Dictionary Sample

A sample mapping is provided in dictionary.sample.json:

```
{
  "Billie": "Billie",
  "Raycast": "Raycast",
  "GPT-4": "GPT-4"
}
```

Drop a customised JSON file at the path set in Settings to bias the STT prompt.
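
To make the idea of biasing the STT prompt concrete, the sketch below parses a dictionary in the sample's format and folds its values into a prompt string alongside the wake phrases. Heiji's actual prompt builder and its wording are not shown in this README, so `parseDictionary`, `buildTranscriptionPrompt`, and the prompt text are hypothetical. Treating each value as the preferred spelling is also an assumption.

```typescript
// Hypothetical prompt builder; Heiji's actual builder and wording are not shown here.
const PROMPT_WAKE_PHRASES = ["Heiji", "Hey G", "Heyji"]; // phrases listed in this README

type Dictionary = Record<string, string>;

/** Parse dictionary JSON, keeping only string-to-string entries. */
function parseDictionary(json: string): Dictionary {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Dictionary must be a JSON object");
  }
  const source = parsed as Record<string, unknown>;
  const result: Dictionary = {};
  for (const key of Object.keys(source)) {
    const value = source[key];
    if (typeof value === "string") result[key] = value; // ignore non-string entries
  }
  return result;
}

/** Build a prompt that keeps wake phrases verbatim and lists preferred spellings. */
function buildTranscriptionPrompt(dictionary: Dictionary): string {
  const spellings = Object.keys(dictionary).map((key) => dictionary[key]);
  // Drop duplicate spellings while preserving first-seen order.
  const unique = spellings.filter((s, i) => spellings.indexOf(s) === i);
  const lines = [
    `Keep these wake phrases verbatim: ${PROMPT_WAKE_PHRASES.join(", ")}.`,
  ];
  if (unique.length > 0) {
    lines.push(`Preferred spellings: ${unique.join(", ")}.`);
  }
  return lines.join("\n");
}

const sample = parseDictionary('{"Billie":"Billie","Raycast":"Raycast","GPT-4":"GPT-4"}');
console.log(buildTranscriptionPrompt(sample));
```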

Logging & Metrics

Each dictation run records transient metrics (timestamps t0–t4, total latency, transcript length, mode). Metrics live only in memory and can be inspected via a short Node snippet while the process is running:

```
npx ts-node -e 'import { getMetrics, metricsAsMarkdown } from "./src/logger"; console.log(metricsAsMarkdown(getMetrics()));'
```

(Feel free to wrap this in a small Raycast debug command; metrics are never written to disk.)
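
For readers who want a feel for what such a metrics record and its Markdown rendering might look like, here is a minimal sketch. It does not reproduce `src/logger`: the `DictationMetrics` shape, the `renderMetricsTable` name, and the choice to compute total latency as `t4 - t0` are assumptions, and the meaning of the intermediate timestamps is left unspecified because this README does not define them.

```typescript
// Hypothetical record shape; the real src/logger types are not shown in this README.
interface DictationMetrics {
  mode: "dictate" | "wake";
  // t0..t4 in milliseconds; the meaning of each step is not specified here.
  timestamps: [number, number, number, number, number];
  transcriptLength: number;
}

/** Total latency, assumed here to be the span from t0 to t4. */
function totalLatencyMs(m: DictationMetrics): number {
  return m.timestamps[4] - m.timestamps[0];
}

/** Render one Markdown table row per recorded run. */
function renderMetricsTable(runs: DictationMetrics[]): string {
  const header = [
    "| Mode | Total latency (ms) | Transcript length |",
    "| --- | --- | --- |",
  ];
  const rows = runs.map(
    (m) => `| ${m.mode} | ${totalLatencyMs(m)} | ${m.transcriptLength} |`
  );
  return header.concat(rows).join("\n");
}

const runs: DictationMetrics[] = [
  { mode: "dictate", timestamps: [0, 100, 400, 900, 1200], transcriptLength: 42 },
];
console.log(renderMetricsTable(runs));
```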

Manual Verification Checklist

Run npm run lint && npm run test && npm run build after installing dependencies.
Dictate command inserts text and routes wake phrases, completing in under 1.5 s for short utterances.
Retranscribe command successfully retries the cached recording.
Clipboard is restored and Cmd+Z undoes each paste.
Metrics accessible via metricsAsMarkdown during the session.

Private Beta Publishing

Follow Raycast’s private publishing flow:

Run npx ray login and npx ray publish from the project root.
Select Private visibility and invite testers by Raycast handle.
Ensure the README, icon, and command descriptions meet store guidelines before submission.

Shipping the Extension

When you are ready to distribute a new build (private beta or public release), follow this checklist:

Verify the release

```
npm run lint
npm run test
npm run build
npx ray lint
npx ray build
```

ray build outputs an optimized bundle in ./dist.

Prepare release notes
- Summarise user-facing changes.
- Capture any migration steps (e.g. new permissions, required settings).

Publish through Raycast

```
npx ray login          # once per machine
npx ray publish        # choose Private or Public visibility
```

Follow the interactive prompts; attach the dist bundle that ray build produced if requested.

Tag the release (optional)

```
git tag -a vX.Y.Z -m "Heiji vX.Y.Z"
git push origin vX.Y.Z
```

Notify testers/users
- Share the Raycast store link.
- Mention the hotkey and new capabilities to highlight in onboarding.

For background, see BRIEF.md, SPEC.md, and PLAN.md.
