Merged
37 commits
78cfc65
chore: snapshot unstaged work
bensonwong Apr 3, 2026
8c7ad59
feat(api): add deepTextPages field, deprecate deepTextPromptPortion
bensonwong Apr 3, 2026
b0f39fe
feat(prompts): deterministic deepTextPages rendering in wrapCitationP…
bensonwong Apr 3, 2026
e027a7b
feat(cli/hydrate): prefer prepare-*.json, support deepTextPages, infe…
bensonwong Apr 3, 2026
59a5394
feat(cli): add merge command for parallel-section workflows
bensonwong Apr 3, 2026
a4ecaa9
fix(drawing): always show anchor highlight when anchor and phrase bot…
bensonwong Apr 3, 2026
0ce6ae9
fix(icons): soften DeepCitationIcon bracket color to muted-foreground
bensonwong Apr 3, 2026
3dad3d7
docs: migrate deepTextPromptPortion → deepTextPages across docs, exam…
bensonwong Apr 3, 2026
383703f
chore: migrate tests and types from deepTextPromptPortion to deepText…
bensonwong Apr 3, 2026
6de6efc
fix(review): delete deepTextPromptPortion in --deep-text; safeExec fo…
bensonwong Apr 3, 2026
0690355
feat(api): replace deepTextPages/deepTextPromptPortion with deepTextP…
bensonwong Apr 3, 2026
22bdfa6
feat(cli): auto-generate citations from body markers; body-only merge…
bensonwong Apr 3, 2026
8ffa545
fix(react): move "use client" directive to post-build injection script
bensonwong Apr 3, 2026
ef83ea6
chore: formatting — import ordering, line-length splits
bensonwong Apr 3, 2026
e1d88a6
test: update tests for deepTextPagesByAttachmentId rename and legacy …
bensonwong Apr 3, 2026
4ea7e9e
docs: update docs, examples, and scripts for deepTextPagesByAttachmen…
bensonwong Apr 3, 2026
036092b
fix(ui): remove pan arrow icons from AnchorTextFocusedImage buttons
bensonwong Apr 3, 2026
a33dc06
fix: update test assertions for new error messages; biome formatting …
bensonwong Apr 3, 2026
85842d9
test(ct): update anchor highlight tests to match always-show behavior
bensonwong Apr 3, 2026
b4b13ed
fix(ui): widen drawer title, lock body scroll to prevent double scrol…
bensonwong Apr 3, 2026
13396c0
feat(ui): consolidate drawer header indicators into single StackedSta…
bensonwong Apr 3, 2026
41dd647
feat(cli): support anchor hint syntax `cite:N "anchor"` in markers an…
bensonwong Apr 3, 2026
f07bf5b
fix(cli): handle raw deepTextPages without page tags in hydrate
bensonwong Apr 3, 2026
035872c
feat(ui): add active index, size, gap, and on-page props to StackedSt…
bensonwong Apr 3, 2026
8923367
test(i18n): add toggle annotation aria keys to known unused set
bensonwong Apr 3, 2026
0acb11f
fix: address code review findings — regex consistency, a11y, NUL safety
bensonwong Apr 3, 2026
a7f5619
feat(cli): support single-quoted anchors and add generic-word stoplist
bensonwong Apr 3, 2026
01b2027
fix(cli): use full-length anchor text instead of 4-word truncation
bensonwong Apr 3, 2026
5d92369
fix(ui): move displayLabel annotation outside ClaimQuote component
bensonwong Apr 3, 2026
1b64e79
fix: address review findings — safeExec, sanitizeForLog, hoist stoplist
bensonwong Apr 3, 2026
459ab32
feat(cli): auto-gen priority, URL source map, cite reuse warning, raw…
bensonwong Apr 3, 2026
833ea96
fix(ui): inline citation wrapping, download filenames, favicon
bensonwong Apr 3, 2026
ea65247
fix: address review findings — URL validation, line ID collision, emp…
bensonwong Apr 3, 2026
426209e
fix(tests): cite: link bold labels, drawer-header-indicators, h2 trun…
bensonwong Apr 3, 2026
30e4ed7
chore: Auto-update Playwright visual snapshots [skip ci]
github-actions[bot] Apr 3, 2026
4d8d226
fix(lint): resolve biome ci violations — noNonNullAssertion, noContro…
bensonwong Apr 3, 2026
804b210
fix(security): resolve CodeQL alerts — TOCTOU in hydrate.ts and ReDoS…
bensonwong Apr 4, 2026
14 changes: 7 additions & 7 deletions INTEGRATION.md
@@ -198,7 +198,7 @@ async function analyzeDocument(filePath: string, question: string) {

// Step 1: Prepare source
const document = readFileSync(filePath);
-const { fileDataParts, deepTextPromptPortion } = await deepcitation.prepareAttachments([
+const { fileDataParts, deepTextPages } = await deepcitation.prepareAttachments([
{ file: document, filename: filePath },
]);
const attachmentId = fileDataParts[0].attachmentId; // 20-char alphanumeric ID
@@ -207,7 +207,7 @@ async function analyzeDocument(filePath: string, question: string) {
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a helpful assistant. Cite your sources.",
userPrompt: question,
-deepTextPromptPortion,
+deepTextPages,
});

const response = await openai.chat.completions.create({
@@ -455,15 +455,15 @@ Get your API key at [deepcitation.com/signup](https://deepcitation.com/signup).

### 1.4 Prepare Sources

-Upload documents to get an `attachmentId` (a **20-character alphanumeric ID**) and `deepTextPromptPortion` (structured text content used to enhance LLM prompts). Save `attachmentId` — you'll need it for verification.
+Upload documents to get an `attachmentId` (a **20-character alphanumeric ID**) and `deepTextPages` (structured text content used to enhance LLM prompts). Save `attachmentId` — you'll need it for verification.
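
As a quick sanity check before storing the ID, the "20-character alphanumeric" description above can be turned into a small guard. This is only a sketch: `isGeneratedAttachmentId` is a hypothetical helper, not part of the `deepcitation` package, and it assumes ASCII letters and digits. Custom IDs, if you supply your own, may not match this pattern.

```typescript
// Hypothetical helper (not part of the deepcitation package).
// Assumes a system-generated attachmentId is exactly 20 ASCII letters or digits.
const GENERATED_ID_PATTERN = /^[A-Za-z0-9]{20}$/;

function isGeneratedAttachmentId(value: unknown): value is string {
  return typeof value === "string" && GENERATED_ID_PATTERN.test(value);
}
```

A check like this would typically run once, right after `prepareAttachments()` returns and before the ID is written to a database or cache.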

**Files:**

```typescript
import { readFileSync } from "fs";

const document = readFileSync("./document.pdf");
-const { fileDataParts, deepTextPromptPortion } = await deepcitation.prepareAttachments([
+const { fileDataParts, deepTextPages } = await deepcitation.prepareAttachments([
{ file: document, filename: "document.pdf" },
{ file: imageBuffer, filename: "chart.png" }, // multiple files supported
]);
@@ -475,7 +475,7 @@ const attachmentId = fileDataParts[0].attachmentId; // e.g. "a1b2c3d4e5f6g7h8i9j
**URLs:**

```typescript
-const { attachmentId, deepTextPromptPortion, metadata } = await deepcitation.prepareUrl({
+const { attachmentId, deepTextPages, metadata } = await deepcitation.prepareUrl({
url: "https://example.com/article",
});
```
@@ -527,7 +527,7 @@ import { wrapCitationPrompt } from "deepcitation";
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a helpful assistant...",
userPrompt: "Summarize this document",
-deepTextPromptPortion, // from Section 1 — prepareAttachments or prepareUrl
+deepTextPages, // from Section 1 — prepareAttachments or prepareUrl
});
```

@@ -807,7 +807,7 @@ const llmOutput = result.response.text();

### No citations in LLM output

-- Verify `deepTextPromptPortion` is passed to `wrapCitationPrompt()`
+- Verify `deepTextPages` is passed to `wrapCitationPrompt()`
- Try a different LLM model (some follow citation instructions better)
- Use `CITATION_REMINDER` for reinforcement in multi-turn conversations

4 changes: 2 additions & 2 deletions README.md
@@ -47,15 +47,15 @@ const deepCitation = new DeepCitation({
});

// 1) Process documents
-const { deepTextPromptPortion } = await deepCitation.prepareAttachments([
+const { deepTextPages } = await deepCitation.prepareAttachments([
{ file: pdfBuffer, filename: "report.pdf" },
]);

// 2) Wrap prompts before calling your model
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a helpful assistant...",
userPrompt: "Summarize the key findings",
-deepTextPromptPortion,
+deepTextPages,
});

const response = await yourLLM.chat({
14 changes: 10 additions & 4 deletions docs/api-reference.md
@@ -58,7 +58,7 @@ For Office files (DOCX, XLSX, PPTX, ODT, ODS, ODP) and web pages, use the [URL p
| Field | Type | Description |
|:------|:-----|:------------|
| `attachmentId` | string | System-generated or custom ID for verification calls |
-| `deepTextPromptPortion` | string | Formatted text with page markers and line IDs for `wrapCitationPrompt()` |
+| `deepTextPages` | string[] | Raw page text returned by `prepareAttachments()` and preferred input for `wrapCitationPrompt()` |
| `status` | `"ready"` \| `"error"` | Processing status |
| `metadata` | object | File metadata (filename, mimeType, pageCount, textByteSize) |
| `processingTimeMs` | number | Processing time in milliseconds |
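
The response fields listed in this table can be sketched as a TypeScript shape with a runtime check, which helps when the JSON arrives from a raw HTTP call rather than through the SDK. The field names come from the table; the names `PrepareResponseSketch` and `isPrepareResponse` are hypothetical, and treating `metadata` as an opaque object is an assumption.

```typescript
// Illustrative shape only: field names come from the response table;
// the type and function names are hypothetical, not SDK exports.
interface PrepareResponseSketch {
  attachmentId: string;
  deepTextPages: string[];
  status: "ready" | "error";
  metadata: Record<string, unknown>;
  processingTimeMs: number;
}

function isPrepareResponse(value: unknown): value is PrepareResponseSketch {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.attachmentId === "string" &&
    Array.isArray(v.deepTextPages) &&
    v.deepTextPages.every((page) => typeof page === "string") &&
    (v.status === "ready" || v.status === "error") &&
    typeof v.metadata === "object" &&
    v.metadata !== null &&
    typeof v.processingTimeMs === "number"
  );
}
```

A guard like this also rejects a payload that still carries only the legacy `deepTextPromptPortion` string, which can surface a stale client or cached response early.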
@@ -81,7 +81,10 @@ curl -X POST "https://api.deepcitation.com/prepareAttachments" \
```json
{
"attachmentId": "abc123-def456-ghi789",
-"deepTextPromptPortion": "<page_number_1_index_0>\n<line id=\"1\">Revenue increased by 25% in Q4...</line>\n<line id=\"2\">Net profit margin improved...</line>",
+"deepTextPages": [
+"Revenue increased by 25% in Q4...",
+"Net profit margin improved..."
+],
"metadata": {
"filename": "document.pdf",
"mimeType": "application/pdf",
@@ -140,7 +143,10 @@ curl -X POST "https://api.deepcitation.com/prepareAttachments" \
```json
{
"attachmentId": "url-abc123-def456",
-"deepTextPromptPortion": "<page_number_1_index_0>\n<line id=\"1\">Example Article Title...</line>\n<line id=\"2\">Published on March 15, 2026...</line>",
+"deepTextPages": [
+"Example Article Title...",
+"Published on March 15, 2026..."
+],
"metadata": {
"filename": "article.pdf",
"mimeType": "application/pdf",
@@ -287,7 +293,7 @@ Retrieve full attachment metadata by ID, including page renders, verifications,
| `pageImages` | PageImage[] | Page renders with dimensions |
| `pageImagesStatus` | string | Page image generation status |
| `verifications` | `Record<string, Verification>` | All verification results for this attachment |
-| `deepTextPromptPortion` | string | Extracted text with line IDs (if available) |
+| `deepTextPages` | string[] | Raw extracted page text |
| `urlSource` | object | Source URL information (for URL-based attachments) |
| `expiresAt` | string \| `"never"` | Expiration date |
| `uploadedAt` | string | Upload timestamp (ISO 8601) |
6 changes: 3 additions & 3 deletions docs/code-examples.md
@@ -67,16 +67,16 @@ import {
const deepcitation = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY });

// 1. Upload multiple documents
-const { fileDataParts, deepTextPromptPortion } = await deepcitation.prepareAttachments([
+const { fileDataParts, deepTextPagesByAttachmentId } = await deepcitation.prepareAttachments([
{ file: contractPdf, filename: "contract.pdf" },
{ file: invoicePdf, filename: "invoice.pdf" },
]);

-// 2. Wrap prompts with combined file content
+// 2. Wrap prompts with the per-attachment raw page map
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a document analyst that cites sources.",
userPrompt: "Compare the contract terms with the invoice amounts.",
-deepTextPromptPortion, // All files combined into one string
+deepTextPagesByAttachmentId,
});

// 3. Call your LLM
2 changes: 1 addition & 1 deletion docs/curl-guide.md
@@ -106,7 +106,7 @@ UPLOAD_RESPONSE=$(curl -s -X POST "$BASE_URL/prepareAttachments" \

# Extract IDs from response
ATTACHMENT_ID=$(echo $UPLOAD_RESPONSE | jq -r '.attachmentId')
-PROMPT_CONTENT=$(echo $UPLOAD_RESPONSE | jq -r '.deepTextPromptPortion')
+PROMPT_CONTENT=$(echo $UPLOAD_RESPONSE | jq -r '.deepTextPages')

echo "Attachment ID: $ATTACHMENT_ID"
echo "Prompt content ready for LLM"
2 changes: 1 addition & 1 deletion docs/error-handling.md
@@ -170,7 +170,7 @@ If `getAllCitationsFromLlmOutput()` returns an empty object `{}`, check:

1. **Did you wrap the prompt?** Use `wrapCitationPrompt()` to add citation instructions to your LLM call
2. **Is the LLM following the format?** Check the raw LLM output for `<cite ... />` tags or `<<<CITATION_DATA>>>` blocks
-3. **Did you pass the `deepTextPromptPortion`?** The LLM needs the source text with line IDs to cite
+3. **Did you pass the `deepTextPages`?** The LLM needs the source text that `wrapCitationPrompt()` renders into citation-ready prompt text
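
Checks 2 and 3 in this list can be automated with a small helper that inspects only the strings you already hold. This is a sketch under stated assumptions: `diagnoseMissingCitations` is a hypothetical name, not part of the `deepcitation` package, and the patterns it looks for are just the `<cite ... />` and `<<<CITATION_DATA>>>` markers named above. Check 1 (whether the prompt was wrapped) cannot be detected from the output alone.

```typescript
// Hypothetical diagnostic helper; not part of the deepcitation package.
// It only inspects strings you already have and makes no API calls.
function diagnoseMissingCitations(llmOutput: string, deepTextPages: string[]): string[] {
  const hints: string[] = [];

  // Check 3: the model needs non-empty source text to cite from.
  const hasSourceText = deepTextPages.some((page) => page.trim().length > 0);
  if (!hasSourceText) {
    hints.push("deepTextPages is empty; pass the pages returned by prepareAttachments()");
  }

  // Check 2: look for either citation format in the raw output.
  const hasCiteTag = /<cite\b[^>]*\/>/.test(llmOutput);
  const hasDataBlock = llmOutput.includes("<<<CITATION_DATA>>>");
  if (!hasCiteTag && !hasDataBlock) {
    hints.push("no <cite ... /> tags or <<<CITATION_DATA>>> block found in the raw output");
  }

  return hints;
}
```

An empty result means neither problem was detected, so the next place to look is the prompt wrapping step or the model choice.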

```typescript
const citations = getAllCitationsFromLlmOutput(llmOutput);
4 changes: 2 additions & 2 deletions docs/frameworks/agui.md
@@ -64,13 +64,13 @@ export async function POST(req: Request) {
}

// 1. Prepare attachment (cached across requests)
-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+const { fileDataParts, deepTextPages } = await dc.prepareAttachments([
{ file: pdfBuffer, filename: "report.pdf" },
]);

// 2. Wrap prompts
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
-systemPrompt, userPrompt, deepTextPromptPortion,
+systemPrompt, userPrompt, deepTextPages,
});

// 3. Stream LLM response via AG-UI events
8 changes: 4 additions & 4 deletions docs/frameworks/express.md
@@ -67,13 +67,13 @@ app.post("/api/upload", upload.single("file"), async (req, res) => {
const file = req.file;
if (!file) return res.status(400).json({ error: "No file provided" });

-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+const { fileDataParts, deepTextPages } = await dc.prepareAttachments([
{ file: file.buffer, filename: file.originalname },
]);

res.json({
fileDataPart: fileDataParts[0],
-deepTextPromptPortion,
+deepTextPages,
});
});
```
@@ -87,12 +87,12 @@ import { wrapCitationPrompt } from "deepcitation";

// POST /api/chat
app.post("/api/chat", async (req, res) => {
-const { userMessage, deepTextPromptPortion } = req.body;
+const { userMessage, deepTextPages } = req.body;

const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a helpful assistant that provides cited responses.",
userPrompt: userMessage,
-deepTextPromptPortion,
+deepTextPages,
});

// Replace with your LLM provider (e.g. gpt-5-mini, gemini-2.0-flash-lite)
2 changes: 1 addition & 1 deletion docs/frameworks/index.md
@@ -39,7 +39,7 @@ DeepCitation is framework-agnostic. It adds two server-side steps around your ex
[your docs] → prepareAttachments() → [enhanced prompt] → [your LLM] → verifyAttachment() → [verified output]
```

-1. **Before the LLM call** — `prepareAttachments()` uploads source files and returns a `deepTextPromptPortion` string you inject into your prompt
+1. **Before the LLM call** — `prepareAttachments()` uploads source files and returns `deepTextPages` (raw page text) that `wrapCitationPrompt()` renders deterministically when you build the prompt
2. **After the LLM call** — `verifyAttachment()` checks citations in the LLM's response against the source, returning visual proof

The React components (`CitationComponent`, `CitationDrawer`) are client-only and optional — they render the verification results. You can use a plain text or Slack renderer instead.
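
The two server-side steps can be sketched as one function with the SDK calls injected as dependencies, which keeps the ordering visible without tying the example to a specific framework. Everything in `PipelineDeps` is a hypothetical stand-in: only the order of the steps comes from the text above, and the real `prepareAttachments()`, `wrapCitationPrompt()`, and `verifyAttachment()` signatures may differ.

```typescript
// Hypothetical stand-ins for the real SDK calls. Only the order of the
// steps is taken from the docs; every signature here is an assumption.
interface PipelineDeps {
  prepare(file: Uint8Array, filename: string): Promise<{ attachmentId: string; deepTextPages: string[] }>;
  wrap(args: { systemPrompt: string; userPrompt: string; deepTextPages: string[] }): {
    enhancedSystemPrompt: string;
    enhancedUserPrompt: string;
  };
  callLlm(system: string, user: string): Promise<string>;
  verify(attachmentId: string, llmOutput: string): Promise<unknown>;
}

async function answerWithVerification(
  deps: PipelineDeps,
  file: Uint8Array,
  filename: string,
  question: string,
) {
  // Step 1 (before the LLM call): upload the source and get raw page text,
  // then build the enhanced prompts from it.
  const { attachmentId, deepTextPages } = await deps.prepare(file, filename);
  const { enhancedSystemPrompt, enhancedUserPrompt } = deps.wrap({
    systemPrompt: "You are a helpful assistant. Cite your sources.",
    userPrompt: question,
    deepTextPages,
  });

  // Your existing LLM call, unchanged apart from the enhanced prompts.
  const llmOutput = await deps.callLlm(enhancedSystemPrompt, enhancedUserPrompt);

  // Step 2 (after the LLM call): check the citations against the source.
  const verification = await deps.verify(attachmentId, llmOutput);
  return { llmOutput, verification };
}
```

Injecting the calls this way also makes the flow easy to exercise in tests with stubs, before wiring in the real client.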
22 changes: 11 additions & 11 deletions docs/frameworks/langchain.md
@@ -72,9 +72,9 @@ async function answerWithCitations(pdfPath: string, question: string) {
const fileBuffer = readFileSync(pdfPath);

// 2. Upload to DeepCitation
-// Returns deepTextPromptPortion: the document content formatted for
-// citation-aware prompting, and fileDataParts for verification.
-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+// Returns deepTextPages: the raw document pages for citation-aware
+// prompting, and fileDataParts for verification.
+const { fileDataParts, deepTextPages } = await dc.prepareAttachments([
{ file: fileBuffer, filename: pdfPath.split("/").pop()! },
]);

@@ -84,7 +84,7 @@ async function answerWithCitations(pdfPath: string, question: string) {
systemPrompt:
"You are a precise research assistant. Answer questions based only on the provided documents.",
userPrompt: question,
-deepTextPromptPortion,
+deepTextPages,
});

// 4. Call your LangChain model — no special DC integration needed here
@@ -158,7 +158,7 @@ const dc = new DeepCitation({ apiKey: process.env.DEEPCITATION_API_KEY! });
interface PipelineInput {
question: string;
// Passed in from the pre-step (document preparation)
-deepTextPromptPortion: string;
+deepTextPages: string[];
attachmentId: string;
}

@@ -177,7 +177,7 @@ const citationChain = RunnableSequence.from([
systemPrompt:
"You are a precise research assistant. Cite sources for every factual claim.",
userPrompt: input.question,
-deepTextPromptPortion: input.deepTextPromptPortion,
+deepTextPages: input.deepTextPages,
});
return {
system: enhancedSystemPrompt,
@@ -202,7 +202,7 @@ async function runCitationPipeline(
question: string,
): Promise<PipelineOutput> {
// Pre-step: prepare DC attachment (runs before the chain)
-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+const { fileDataParts, deepTextPages } = await dc.prepareAttachments([
{ file: fileBuffer, filename },
]);

@@ -211,7 +211,7 @@ async function runCitationPipeline(
// Run the inner chain
const answer = await citationChain.invoke({
question,
-deepTextPromptPortion,
+deepTextPages,
attachmentId,
});

@@ -248,20 +248,20 @@ async function runCitationPipeline(

## Multiple Documents

-Pass multiple files to `prepareAttachments` in a single call. DeepCitation combines them into one `deepTextPromptPortion` string:
+Pass multiple files to `prepareAttachments` in a single call. DeepCitation returns a `deepTextPagesByAttachmentId` map so each attachment stays explicit and order-independent:

```typescript
import { groupCitationsByAttachmentId } from "deepcitation";

-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+const { fileDataParts, deepTextPagesByAttachmentId } = await dc.prepareAttachments([
{ file: contractBuffer, filename: "contract.pdf" },
{ file: invoiceBuffer, filename: "invoice.pdf" },
]);

const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a document analyst. Cite sources for every claim.",
userPrompt: "What are the total costs and payment terms?",
-deepTextPromptPortion, // Both documents combined
+deepTextPagesByAttachmentId,
});

const model = new ChatOpenAI({ model: "gpt-4o-mini" });
12 changes: 6 additions & 6 deletions docs/frameworks/mastra.md
@@ -57,7 +57,7 @@ const chunks = await doc.chunk({ strategy: "recursive", maxSize: 180, overlap: 3
// ... embed and upsert into LibSQLVector ...

// 2. Prepare source PDFs for DeepCitation
-const { fileDataParts, deepTextPromptPortion } = await dc.prepareAttachments([
+const { fileDataParts, deepTextPages } = await dc.prepareAttachments([
{ file: pdfBuffer, filename: "report.pdf" },
]);

@@ -67,7 +67,7 @@ const retrievedChunks = await vectorStore.query({ indexName: "corpus", queryVect
const { enhancedSystemPrompt, enhancedUserPrompt } = wrapCitationPrompt({
systemPrompt: "You are a research assistant that cites sources.",
userPrompt: userQuestion,
-deepTextPromptPortion,
+deepTextPages,
});

const response = await openai.chat.completions.create({
@@ -91,7 +91,7 @@ const { verifications } = await dc.verify({
Uploading PDFs on every request is wasteful. Cache the `attachmentId` and reuse it:

```typescript
-const cache = new Map<string, Promise<{ attachmentId: string; deepTextPromptPortion: string }>>();
+const cache = new Map<string, Promise<{ attachmentId: string; deepTextPages: string[] }>>();

async function getAttachment(source: { id: string; url: string; filename: string }) {
const existing = cache.get(source.id);
@@ -102,8 +102,8 @@ async function getAttachment(source: { id: string; url: string; filename: string
const savedId = process.env[`DEEPCITATION_ATTACHMENT_${source.id.toUpperCase()}`];
if (savedId) {
const attachment = await dc.getAttachment(savedId);
-if (attachment.deepTextPromptPortion) {
-return { attachmentId: savedId, deepTextPromptPortion: attachment.deepTextPromptPortion };
+if (attachment.deepTextPages?.length) {
+return { attachmentId: savedId, deepTextPages: attachment.deepTextPages };
}
}

@@ -112,7 +112,7 @@ async function getAttachment(source: { id: string; url: string; filename: string
const prepared = await dc.prepareAttachments([{ file, filename: source.filename }]);
return {
attachmentId: prepared.fileDataParts[0].attachmentId,
-deepTextPromptPortion: prepared.deepTextPromptPortion,
+deepTextPages: prepared.deepTextPages,
};
})();
