Human do Marginalia, AIs do Zettelkasten
A web reader for researchers. PDFs are parsed into addressable blocks; notes cite blocks; accumulated notes grow into auto-built wiki pages, paper graphs, and agent-facing memory. AI is summoned, not assumed.
Most "AI for papers" tools answer for you. Sapientia is built for the opposite habit: you read, and the assistant stays out of the way until you summon it. When you do summon it, every claim is grounded in addressable blocks of the paper you're reading, citations stay verifiable, and the notes you write become the knowledge base, not a chat log. A more detailed motivation is in "Human do Marginalia, AIs do Zettelkasten".
- Block-addressable PDFs. Papers are parsed by MinerU into stable, content-hashed blocks. Every figure, equation, paragraph, and table has its own ID, paperId#blockId, that survives re-parse.
- Side-by-side reading. PDF and parsed-Markdown views of the same paper, kept in sync. Click a block in either pane to focus it in the other; both views remember scroll on toggle.
- Citations as first-class data. Notes are written in a Tiptap-based editor (built on the Novel primitives) and embed @[block N] chips that link to the source block. Click a chip to jump to the exact figure or paragraph it cites.
- Highlights with semantics. A built-in five-color palette (Questioning / Important / Original / Pending / Conclusion) plus user-defined palettes. Highlights persist per-block, render in both views, and tag the citation chip with the same color.
- Reader markup. Highlight, underline, and freehand ink on the PDF itself — overlay-only, your original PDF is never modified.
- Paper-local knowledge substrate. Paper compilation produces source summaries, local concepts, source-grounded concept edges, and reader signal overlays. The graph is primarily agent-facing substrate, with read-only visual projections for inspection.
- Note-native Ask. AI answers from inside the note flow, using explicit context layers: current focus, session trace, paper-local source graph, reader observations, and gated long-term memory. Paper claims must stay grounded in block citations.
- Research memory compiler. Notes, highlights, annotations, and saved Ask traces are distilled into reader observations, user bridge edges, and a bounded research_memory_capsule with Key research facts / Today / Earlier this week / Long-term research context.
- Interaction profile. Language, explanation style, workflow, and personalization boundaries are compiled separately from research memory, so collaboration preferences do not get mixed with paper truth.
- Memory controls. Settings exposes Memory & personalization controls: enable/disable research memory in Ask, enable/disable interaction profile use, control auto-compilation, inspect compiled memory, and record corrections that feed the next compiler pass.
- Restraint-first AI. No external tools/search in the current memory phase, no auto-summoning, and no workspace-wide context bleed unless the memory planner explicitly requests it.
- Self-hostable. Bring your own MinerU token and Anthropic / OpenAI key. Postgres + Redis + RustFS/S3-compatible object storage run in your cluster.
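The block-addressing and citation-chip features above can be illustrated with a small sketch. It assumes a reference is serialized exactly as paperId#blockId and that a chip appears in note text as @[block N] with a numeric N. The function names and the regular expression are hypothetical and are not taken from the Sapientia codebase.

```typescript
// Hypothetical helpers for illustration; the real parser may differ.

/** A parsed reference to one block of one paper. */
interface BlockRef {
  paperId: string;
  blockId: string;
}

/** Splits "paperId#blockId" at the first '#'. Returns null on malformed input. */
function parseBlockRef(ref: string): BlockRef | null {
  const hash = ref.indexOf("#");
  // Both sides of the separator must be non-empty.
  if (hash <= 0 || hash === ref.length - 1) return null;
  return { paperId: ref.slice(0, hash), blockId: ref.slice(hash + 1) };
}

/** Formats a reference back into its "paperId#blockId" form. */
function formatBlockRef(ref: BlockRef): string {
  return `${ref.paperId}#${ref.blockId}`;
}

/** Collects the block numbers cited by "@[block N]" chips, in order of appearance. */
function extractCitedBlocks(noteText: string): number[] {
  const chip = /@\[block (\d+)\]/g;
  const cited: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = chip.exec(noteText)) !== null) {
    cited.push(Number(match[1]));
  }
  return cited;
}
```

Keeping the reference as a plain string with a single separator means it can be stored in notes, URLs, and logs without extra escaping, which is one plausible reason for the paperId#blockId shape.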
# Builds and starts web, API, worker, migrations, Postgres, Redis, and RustFS.
pnpm infra:up

Open http://localhost:8080. The API is also exposed on http://localhost:3000,
RustFS/S3 on http://localhost:9000, and the RustFS console on http://localhost:9001.
For anything beyond local testing, copy infra/docker/.env.example, replace the
secrets, and pass it to Compose:
cp infra/docker/.env.example infra/docker/.env
docker compose --env-file infra/docker/.env -f infra/docker/docker-compose.yml up -d --build --force-recreate

GitHub Actions publishes sapientia-api and sapientia-web images to GHCR on
pushes to the publish branch, version branches such as v0.1, version tags
such as v0.1, and manual dispatch. The publish branch produces latest;
each v* branch or tag produces its own version tag. To deploy from published
images instead of building locally, set API_IMAGE and WEB_IMAGE in
infra/docker/.env, then pull and start without building:
API_IMAGE=ghcr.io/<owner>/sapientia-api:latest
WEB_IMAGE=ghcr.io/<owner>/sapientia-web:latest
docker compose --env-file infra/docker/.env -f infra/docker/docker-compose.yml pull api worker web migrate
docker compose --env-file infra/docker/.env -f infra/docker/docker-compose.yml up -d --no-build --force-recreate

After signing in, configure your MinerU token and LLM API key in
/settings; user credentials are stored encrypted using ENCRYPTION_KEY.
The same Settings page includes Memory & personalization controls for the
compiled research memory and interaction profile. Platform admins use
/admin-settings for private beta invites and user access review.
Kubernetes manifests live in infra/k8s. raw/ contains beginner-friendly
plain YAML files, while kustomize/ contains reusable base resources plus dev
and production overlays. Both deploy the same stack as Compose: web, API,
worker, migration job, Postgres, Redis, and RustFS/S3-compatible object storage.
See infra/k8s/README.md for the raw flow, secret setup,
dev overlay, and production overlay flow.
# Toolchain
corepack enable && corepack prepare pnpm@latest --activate
curl -fsSL https://bun.sh/install | bash # Bun ≥ 1.2
brew install colima docker docker-compose && colima start # macOS
# Project
pnpm install
cp apps/api/.env.example apps/api/.env
cp packages/db/.env.example packages/db/.env
docker compose -f infra/docker/docker-compose.yml up -d postgres redis object-storage object-storage-init
pnpm db:migrate # better-auth + app schema
pnpm dev:api # http://localhost:3000 → /health
pnpm dev:web # http://localhost:5173 → sign in
pnpm worker:dev # BullMQ worker for parse, enrich, graph, and memory jobs

/health returns 200 {status:"ok",db:"connected",redis:"connected",s3:"connected"} when every dependency is up, 503 {status:"degraded",...} otherwise.
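A script that waits for the stack to come up could interpret the /health response along these lines. This is a minimal sketch: the field names come from the healthy response described above, while the assumption that a degraded body carries the same db, redis, and s3 keys with other values is mine, as are the function names.

```typescript
// Field names follow the documented healthy response. The shape of a
// degraded body is an assumption, so every dependency field is optional.
interface HealthBody {
  status: "ok" | "degraded";
  db?: string;
  redis?: string;
  s3?: string;
}

type Dependency = "db" | "redis" | "s3";

/** Lists the dependencies that are not reported as "connected". */
function failingDependencies(body: HealthBody): Dependency[] {
  const deps: Dependency[] = ["db", "redis", "s3"];
  return deps.filter((dep) => body[dep] !== "connected");
}

/** True only for the documented healthy response: HTTP 200 with status "ok". */
function isHealthy(httpStatus: number, body: HealthBody): boolean {
  return httpStatus === 200 && body.status === "ok";
}
```

With the API running, the inputs would typically come from a request to http://localhost:3000/health, passing the response's status code and parsed JSON body to these helpers.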
After signing in, configure your MinerU token and LLM API key in
/settings. Uploaded PDFs queue for parsing; progress shows in the library badge
as parsing N/M. Open a parsed paper for the side-by-side reader; click "New
note" for the three-pane reading + writing layout. When you add notes,
highlights, annotations, or note-native Ask turns, the worker can compile
research memory in the background.
In development, authenticated dev endpoints expose the agent memory state:
/api/v1/dev/memory-trace?workspaceId=...&paperId=...
/api/v1/dev/agent-context-preview?workspaceId=...&paperId=...&question=...
/api/v1/dev/research-memory-capsule?workspaceId=...
/api/v1/dev/interaction-profile?workspaceId=...
User-facing memory controls live under:
/api/v1/memory/settings?workspaceId=...
/api/v1/memory/research?workspaceId=...
/api/v1/memory/interaction-profile?workspaceId=...
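When calling the endpoints listed above from a script, the query string needs to be encoded, especially the free-text question parameter. The helper below is a hypothetical sketch: only the paths and parameter names come from the lists above, and the IDs are placeholders. The dev endpoints are authenticated, so a real request would also need to carry a signed-in session.

```typescript
// Hypothetical helper; not part of the Sapientia codebase.
type QueryParams = Record<string, string>;

/** Joins a base URL, a path, and percent-encoded query parameters. */
function buildApiUrl(baseUrl: string, path: string, params: QueryParams): string {
  const parts: string[] = [];
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value === undefined) continue; // skip keys without a value
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
  }
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return parts.length > 0 ? `${base}${path}?${parts.join("&")}` : `${base}${path}`;
}

// Placeholder workspace and paper IDs, for illustration only.
const previewUrl = buildApiUrl("http://localhost:3000", "/api/v1/dev/agent-context-preview", {
  workspaceId: "ws_demo",
  paperId: "paper_demo",
  question: "What loss is used?",
});
```

The same helper works for the user-facing routes, for example buildApiUrl("http://localhost:3000", "/api/v1/memory/settings", { workspaceId: "ws_demo" }).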
| Layer | Stack |
|---|---|
| Frontend | React 19 · TypeScript (strict) · Vite · Tailwind v4 · shadcn/ui · Zustand · TanStack Query/Router · Tiptap (via Novel) · PDF.js · react-force-graph-3d |
| Backend | Bun ≥ 1.2 · Hono · Drizzle ORM · Zod · BullMQ · better-auth · AWS SDK v3 |
| Data | PostgreSQL 16 + pgvector · Redis 7 · RustFS / S3-compatible object storage |
| External | MinerU (PDF parsing) · Anthropic or OpenAI (LLM) |
| Tooling | pnpm workspaces · Biome · vitest · Playwright · testcontainers-node · Docker Compose · Kustomize |
apps/
  web/      React frontend (Vite)
  api/      Hono backend (Bun): routes, services, BullMQ workers
packages/
  shared/   Zod schemas, types, prompts shared across the stack
  db/       Drizzle schema, migrations, client
infra/
  docker/   docker-compose for full self-hosted stack
  k8s/      Kustomize manifests (base + dev/prod overlays)
docs/       PRD, ADRs, deployment runbook, design tokens, task cards
demo/       Logo + landing assets
| Command | What it does |
|---|---|
| pnpm dev:web | Vite dev server |
| pnpm dev:api | bun --hot Hono server |
| pnpm worker:dev | BullMQ worker (paper parse/enrich/compile, graph, memory) |
| pnpm infra:up / infra:down / infra:logs | Docker Compose full stack |
| pnpm db:generate | Drizzle Kit: diff schema files into a new migration |
| pnpm db:migrate | Apply migrations against DATABASE_URL |
| pnpm db:studio | Drizzle Studio web UI |
| pnpm typecheck | TypeScript across the workspace |
| pnpm run lint | biome check . (use pnpm run; pnpm 10 reserves pnpm lint) |
| pnpm format | biome format --write . |
| pnpm build | Build all packages |
| pnpm test | Vitest across the workspace (testcontainers for backend integration) |
pnpm test runs vitest across the workspace. The web tests cover auth flow + reader components; the API tests use testcontainers-node for ephemeral Postgres + Redis + S3-compatible object storage and need a working Docker socket. The vitest config auto-discovers colima / Docker-Desktop / standard sockets and sets DOCKER_HOST for you. If your Docker socket lives somewhere unusual, export DOCKER_HOST=unix:///path/to/docker.sock first.
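The socket auto-discovery described above might work roughly as follows. This is a sketch and not the project's actual vitest config: the candidate paths are the commonly used defaults for colima, Docker Desktop, and a standard daemon, and the function name and injected exists callback are hypothetical.

```typescript
// A sketch of socket discovery. `exists` is injected so the logic can be
// exercised without touching the file system.
function resolveDockerHost(
  env: { DOCKER_HOST?: string },
  homeDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  // An explicitly exported DOCKER_HOST always wins.
  if (env.DOCKER_HOST) return env.DOCKER_HOST;

  // Commonly used default locations (assumed); adjust if your setup differs.
  const candidates = [
    `${homeDir}/.colima/default/docker.sock`, // colima, default profile
    `${homeDir}/.docker/run/docker.sock`, // Docker Desktop
    "/var/run/docker.sock", // standard daemon socket
  ];
  const found = candidates.find((path) => exists(path));
  return found === undefined ? undefined : `unix://${found}`;
}
```

In Node or Bun this could be called with process.env, os.homedir(), and fs.existsSync. Checking the explicit variable first is what makes the manual export a reliable override.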
Email/password works out of the box with BETTER_AUTH_SECRET, BETTER_AUTH_URL=http://localhost:3000, and FRONTEND_ORIGIN=http://localhost:5173 set in apps/api/.env. OAuth is optional in local development.
Private beta access is controlled after authentication, so email/password,
Google OAuth, and GitHub OAuth all share the same gate. Set
INVITE_ONLY=true to make new non-admin users waitlisted until they redeem an
invite code. Set [email protected] to make those emails
platform admins on first sign-in; admins can review users and create invite
codes from /admin-settings. Admins can also update beta users' platform role
and access status from that page.
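The gate described above can be pictured as a single decision made after authentication, regardless of sign-in method. The sketch below is illustrative only: the status names, the function, and the case-insensitive email comparison are assumptions, and the admin list is passed in as a parameter.

```typescript
// Hypothetical names throughout; the real role and status values may differ.
type AccessStatus = "active" | "waitlisted";

interface AccessDecision {
  isAdmin: boolean;
  status: AccessStatus;
}

/** Decides a new user's initial access once they have authenticated. */
function decideInitialAccess(
  email: string,
  inviteOnly: boolean,
  adminEmails: string[],
  hasRedeemedInvite: boolean,
): AccessDecision {
  // Case-insensitive matching is an assumption of this sketch.
  const normalized = email.trim().toLowerCase();
  const isAdmin = adminEmails.some((admin) => admin.trim().toLowerCase() === normalized);
  // Admins bypass the waitlist; everyone else needs an invite when INVITE_ONLY=true.
  const active = isAdmin || !inviteOnly || hasRedeemedInvite;
  return { isAdmin, status: active ? "active" : "waitlisted" };
}
```

Because the decision takes only the authenticated email and the invite state, email/password, Google, and GitHub sign-ins all pass through the same check.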
Google OAuth
- Visit Google Cloud Console and create a project.
- Enable the Google Sign-In APIs.
- Create an OAuth 2.0 Web application client.
- Add redirect URI http://localhost:3000/api/auth/callback/google.
- Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in apps/api/.env.
GitHub OAuth
- Visit GitHub Developer Settings → New OAuth App.
- Authorization callback URL: http://localhost:3000/api/auth/callback/github.
- Set GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET in apps/api/.env.
If you set one value of an OAuth provider pair you must set the other — config validation rejects partial provider configuration on boot.
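The "both or neither" rule can be expressed in a few lines. This plain-TypeScript sketch only illustrates the rule; the project's own boot-time validation may be implemented differently (for example with a schema library), and the function name and error wording are hypothetical.

```typescript
// Illustrative check for partially configured OAuth providers.
type Env = Record<string, string | undefined>;

const OAUTH_PAIRS: Array<[string, string]> = [
  ["GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET"],
  ["GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET"],
];

/** Returns one message per provider that has exactly one of its two values set. */
function findPartialOAuthConfig(env: Env): string[] {
  const errors: string[] = [];
  for (const [idKey, secretKey] of OAUTH_PAIRS) {
    const hasId = Boolean(env[idKey]);
    const hasSecret = Boolean(env[secretKey]);
    if (hasId !== hasSecret) {
      const missing = hasId ? secretKey : idKey;
      errors.push(`${missing} is required when its counterpart is set`);
    }
  }
  return errors;
}
```

A boot sequence following this pattern would refuse to start when the returned list is non-empty, which surfaces a half-configured provider immediately instead of at the first sign-in attempt.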
The web app uses TanStack Router with the Vite router plugin. File-based routes live in apps/web/src/routes; apps/web/src/routeTree.gen.ts is generated automatically during build/dev and should be committed when it changes.
In dev, Vite proxies /api/* from :5173 to :3000, so better-auth stays same-origin from the browser's perspective and auth cookies work without extra client configuration.
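For readers unfamiliar with Vite's dev proxy, the relevant part of such a setup might look like the sketch below. It is not copied from apps/web: the local type mirrors only the server.port and server.proxy fields used here so the snippet stands alone, and leaving changeOrigin at false is an assumption.

```typescript
// Sketch of a dev-server proxy section; the real Vite config may differ.
interface DevServerSketch {
  port: number;
  proxy: Record<string, { target: string; changeOrigin: boolean }>;
}

const server: DevServerSketch = {
  port: 5173,
  proxy: {
    // Requests to /api/* on :5173 are forwarded to the API on :3000.
    "/api": { target: "http://localhost:3000", changeOrigin: false },
  },
};

// In a real vite.config.ts this object would be passed to Vite as
// defineConfig({ server }), with defineConfig imported from "vite".
```

Because the browser only ever talks to :5173, cookies set by the auth routes are first-party, which is why no cross-origin cookie configuration is needed in development.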
Current checkpoint — Reading + Agent Memory Substrate. Sign-up → upload PDF → parse via MinerU → block-addressable reader → notes/highlights/annotations → paper-local graph → note-native Ask → compiled research memory and interaction profile with user correction controls.
Sapientia is in heavy development. Open an issue before significant changes; we can discuss the best way to contribute your improvements. For now, the best way to help is to try it out and share feedback!
Please see LICENSE.
