A pi extension that adds native PDF, video, and audio support to pi's built-in read tool, with provider-aware allowlisting so binary content is only sent to models that can actually consume it.
# From git
pi install git:github.com/sshkeda/pi-read
# From local path
pi install /path/to/pi-read

For native OpenRouter audio/video/PDF transport on Pi's OpenAI-compatible path, also apply the companion patches via pi-patches, which reads this repo's patches/pi-patches.json through its sources.json.
pi -e /path/to/pi-read/src/index.ts

Then ask pi to read a file as usual:
read presentation.pdf
read demo.mp4
read podcast.mp3
Pi's built-in read tool supports text and images (jpg, png, gif, webp); this extension extends read to also handle:
| Format | MIME Types |
|---|---|
| Video | video/mp4, video/webm, video/quicktime (.mov) |
| Audio | audio/mpeg (.mp3), audio/wav, audio/ogg (.ogg/.oga/.opus/.spx), audio/mp4 (.m4a) |
| PDF | application/pdf |
Text and images are delegated to the built-in read tool unchanged, including image resizing and truncation.
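The format table can be read as a lookup from file extension to MIME type. The following is a minimal sketch of that idea, not the extension's actual code: the names `BINARY_MIME_BY_EXT` and `binaryMimeFor` are hypothetical, and the `.pdf`, `.mp4`, `.webm`, and `.wav` extensions are assumed from their MIME types rather than stated in the table.

```typescript
// Hypothetical lookup table built from the formats listed above.
const BINARY_MIME_BY_EXT: Record<string, string> = {
  ".pdf": "application/pdf",
  ".mp4": "video/mp4",
  ".webm": "video/webm",
  ".mov": "video/quicktime",
  ".mp3": "audio/mpeg",
  ".wav": "audio/wav",
  ".ogg": "audio/ogg",
  ".oga": "audio/ogg",
  ".opus": "audio/ogg",
  ".spx": "audio/ogg",
  ".m4a": "audio/mp4",
};

// Returns the MIME type for a supported binary file, or undefined when the
// path should fall through to the built-in read tool (text, images, others).
function binaryMimeFor(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot < 0) return undefined; // no extension at all
  const ext = filePath.slice(dot).toLowerCase();
  return BINARY_MIME_BY_EXT[ext];
}
```

Under these assumptions, `binaryMimeFor("demo.mp4")` yields `"video/mp4"`, while `binaryMimeFor("notes.txt")` yields `undefined`, which signals delegation to the built-in tool.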
Before each LLM call, a context event handler checks the current provider/model route against an explicit allowlist. Binary content only goes to providers that can ingest it:
| Route | Behaviour |
|---|---|
| google, google-vertex, google-gemini-cli | ✅ Sent as inlineData |
| OpenRouter / google/gemini-3.1-pro-preview | ✅ Sent natively via pi-patches (file for PDFs, video_url for video, input_audio for audio) |
| OpenRouter / google/gemini-3-flash-preview | ✅ Sent natively via pi-patches |
| Anthropic / Claude, OpenAI / ChatGPT, other OpenRouter models | Stripped and replaced with a text placeholder such as [PDF file omitted — current model does not support inline PDF content] |
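The route table amounts to a small decision function. The sketch below is one way it could look, under stated assumptions: the function names and the `Transport` type are hypothetical, `"openrouter"` is an assumed provider id, and the sketch does not check whether pi-patches is actually applied.

```typescript
type Transport = "inlineData" | "openrouter-native" | "strip";

// Google provider ids are taken from the table above.
const GOOGLE_PROVIDERS = new Set(["google", "google-vertex", "google-gemini-cli"]);

// OpenRouter models listed in the table above.
const OPENROUTER_MODELS = new Set([
  "google/gemini-3.1-pro-preview",
  "google/gemini-3-flash-preview",
]);

// Decide how binary content travels for a provider/model route.
// Anything not explicitly allowlisted is stripped.
function transportFor(provider: string, model: string): Transport {
  if (GOOGLE_PROVIDERS.has(provider)) return "inlineData";
  // "openrouter" is an assumed provider id, not confirmed by this README.
  if (provider === "openrouter" && OPENROUTER_MODELS.has(model)) {
    return "openrouter-native";
  }
  return "strip";
}

// Content-part type on the OpenRouter path, chosen by MIME family.
function openRouterPartType(
  mime: string,
): "file" | "video_url" | "input_audio" | undefined {
  if (mime === "application/pdf") return "file";
  if (mime.startsWith("video/")) return "video_url";
  if (mime.startsWith("audio/")) return "input_audio";
  return undefined; // not a binary type this extension handles
}
```

Defaulting to `"strip"` mirrors the explicit-allowlist design: a new or unknown route never receives binary content until it is added deliberately.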
Stripping operates on a structuredClone of the message history, so the persisted session stays small (lightweight local file refs, not giant base64 blobs in the transcript). Switching to an allowed model later rehydrates the original file.
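A rough sketch of clone-then-strip follows. The `Part` and `Message` shapes, the `fileRef` part type, the `stripBinary` name, and the placeholder wording are all hypothetical, since pi's real message types are not shown here. It assumes a runtime that provides the global `structuredClone` (Node 17 or later).

```typescript
// Hypothetical message shapes for illustration only.
type Part =
  | { type: "text"; text: string }
  | { type: "fileRef"; path: string; mimeType: string };

interface Message {
  role: string;
  content: Part[];
}

// Hypothetical placeholder wording; the extension's real text may differ.
const placeholderFor = (mimeType: string): string =>
  `[${mimeType} file omitted for this model]`;

// Build the outgoing history for a route that cannot ingest binary content.
// Works on a deep copy, so the persisted history keeps its file references.
function stripBinary(history: Message[]): Message[] {
  const copy = structuredClone(history);
  for (const message of copy) {
    message.content = message.content.map((part): Part =>
      part.type === "fileRef"
        ? { type: "text", text: placeholderFor(part.mimeType) }
        : part,
    );
  }
  return copy;
}
```

Because only the copy is modified, the stored history still holds the lightweight file reference, which is what allows a later turn on an allowed model to load the original file again.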
cd ../pi-mock && npm run build
cd ../pi-read && node --test --test-force-exit test/test-pi-read.mjs

Tests live in test/test-pi-read.mjs and use the sibling pi-mock harness to simulate provider requests. They verify:
- Google: PDF/MP4/audio sent with correct MIME type
- Anthropic / OpenAI: binary stripped to text fallback
- OpenRouter allowlist (with pi-patches applied) passes PDF / MP4 / M4A through as file/video_url/input_audio
- Session safety: large audio reads persist local refs instead of base64 blobs; later turns rehydrate correctly
- Text files always delegated to built-in read
- pi coding agent
- An allowed multimodal Gemini route for native PDF/video/audio
- pi-patches applied if you want native OpenRouter audio/video/PDF transport on the OpenAI-compatible path
MIT