Merged
7 changes: 7 additions & 0 deletions .changeset/activity-observers.md
@@ -0,0 +1,7 @@
---
'@tanstack/ai': minor
---

Add activity-agnostic observability to the media activities through the unified middleware system (#720). `generateImage`, `generateVideo`, `generateAudio`, `generateSpeech`, and `generateTranscription` now accept a `middleware` option taking `GenerationMiddleware`s — the base, activity-agnostic contract whose lifecycle hooks (`onStart` / `onUsage` / `onFinish` / `onAbort` / `onError`) receive a `GenerationMiddlewareContext`. `ChatMiddleware` is a superset of this base, so a single `otelMiddleware()` value satisfies both and can be passed to `chat()` and any media activity alike. Like chat middleware, hooks are awaited in order and propagate exceptions (a throwing hook surfaces, rather than being silently swallowed).
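The "awaited in order, exceptions propagate" behaviour described above can be pictured with a small runner. This is a sketch only: `Hook` and `runHooksInOrder` are hypothetical names for illustration, not the library's internal `runGeneration*` helpers, whose real signatures are not shown in full here.

```ts
// Hypothetical hook shape: may be sync or async.
type Hook<C> = (ctx: C) => void | Promise<void>

// Awaits each hook in array order. A hook that throws (or rejects)
// rejects the whole call, so the failure surfaces to the caller
// instead of being swallowed.
async function runHooksInOrder<C>(
  hooks: ReadonlyArray<Hook<C>> | undefined,
  ctx: C,
): Promise<void> {
  for (const hook of hooks ?? []) {
    await hook(ctx)
  }
}
```

Because each hook is awaited before the next starts, a slow async hook delays the ones after it, which is the trade-off of deterministic ordering.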

`otelMiddleware()` (on the existing `@tanstack/ai/middlewares/otel` subpath) now emits one `gen_ai.*` span per media call, tagged with the correct `gen_ai.operation.name` (`image_generation`, `video_generation`, `audio_generation`, `text_to_speech`, `transcription`), reusing the same `gen_ai.usage.*` attribute set as chat — now including `tanstack.ai.usage.units_billed` for unit-billed media. With a `Meter` it records the `gen_ai.client.operation.duration` histogram per activity. An abandoned streaming-video consumer ends the span via `onAbort` (status `ERROR`, `tanstack.ai.completion.reason = cancelled`) instead of leaking it. The `GenerationMiddleware` types are exported from the package root; the `otelMiddleware` value stays on the subpath so importing `@tanstack/ai` never requires the optional `@opentelemetry/api` peer.
38 changes: 38 additions & 0 deletions docs/advanced/otel.md
@@ -175,6 +175,44 @@ otelMiddleware({
})
```

## Beyond chat: media activities

`otelMiddleware` is not chat-only. The media activities — `generateImage`, `generateVideo`, `generateAudio`, `generateSpeech`, and `generateTranscription` — accept the **same** `otelMiddleware` value on their `middleware` option. Each is a single request → response (or submit → poll for video), so the middleware emits one span per call instead of the chat span tree:

```ts
import { generateImage } from '@tanstack/ai'
import { otelMiddleware } from '@tanstack/ai/middlewares/otel'
import { openaiImage } from '@tanstack/ai-openai'
import { trace, metrics } from '@opentelemetry/api'

const otel = otelMiddleware({
  tracer: trace.getTracer('my-app'),
  meter: metrics.getMeter('my-app'),
})

const result = await generateImage({
  adapter: openaiImage('gpt-image-2'),
  prompt: 'A serene mountain landscape at sunset',
  middleware: [otel],
})
```

The same `otel` value can be passed to `chat()` and to any media activity — its shared lifecycle hooks (`onStart` / `onUsage` / `onFinish` / `onAbort` / `onError`) are authored against the activity-agnostic `GenerationMiddlewareContext`, so the one instance works everywhere.

Each media call produces one `CLIENT` span tagged with the activity's `gen_ai.operation.name`:

| Activity | `gen_ai.operation.name` |
| --- | --- |
| `generateImage` | `image_generation` |
| `generateVideo` | `video_generation` |
| `generateAudio` | `audio_generation` |
| `generateSpeech` | `text_to_speech` |
| `generateTranscription` | `transcription` |
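
The table amounts to a lookup from the context's activity discriminator to the operation name. A minimal sketch, assuming the discriminator values `'image'`, `'audio'`, and `'tts'` seen in this PR's `createGenerationContext` calls; the `'video'` and `'transcription'` keys and the `operationNameFor` helper are hypothetical and not taken from the library:

```ts
// 'image', 'audio', and 'tts' appear in this PR; 'video' and
// 'transcription' are assumed names for the remaining activities.
type MediaActivity = 'image' | 'video' | 'audio' | 'tts' | 'transcription'

const OPERATION_NAMES: Record<MediaActivity, string> = {
  image: 'image_generation',
  video: 'video_generation',
  audio: 'audio_generation',
  tts: 'text_to_speech',
  transcription: 'transcription',
}

// Hypothetical helper: resolves the gen_ai.operation.name for an activity.
function operationNameFor(activity: MediaActivity): string {
  return OPERATION_NAMES[activity]
}
```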

The span carries `gen_ai.system` and `gen_ai.request.model` at start and, on finish, the same `gen_ai.usage.*` / `tanstack.ai.usage.*` attributes documented above — including `tanstack.ai.usage.units_billed` for unit-billed media. When a `Meter` is supplied it records the `gen_ai.client.operation.duration` histogram, tagged per activity. For streaming video the span covers the full create → poll → complete lifecycle; for non-streaming `generateVideo` it covers job submission. If a streaming video consumer abandons the stream before completion, the span is ended via `onAbort` (status `ERROR`, `tanstack.ai.completion.reason = cancelled`) rather than leaked.

`otelMiddleware` applies the same `spanNameFormatter`, `attributeEnricher`, `onBeforeSpanStart`, and `onSpanEnd` extension points to media spans — the span info is discriminated by `kind`, where media spans report `kind: 'generation'`. For a custom backend, implement the base `GenerationMiddleware` contract directly; its hooks (`onStart` / `onUsage` / `onFinish` / `onAbort` / `onError`) receive the `GenerationMiddlewareContext` and fire for every activity, chat included. The `GenerationMiddleware` types are exported from the package root, while the `otelMiddleware` value lives on the `@tanstack/ai/middlewares/otel` subpath so importing `@tanstack/ai` never requires the optional `@opentelemetry/api` peer.
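
For a custom backend, a middleware might look roughly like the sketch below. The types here are local stand-ins so the snippet is self-contained: only the hook names and the `requestId` / `activity` / `provider` / `model` fields come from this PR, while the second-argument payload shapes and the `createLogMiddleware` helper are assumptions. In real code the `GenerationMiddleware` type would be imported from `@tanstack/ai`.

```ts
// Local stand-in for GenerationMiddlewareContext (fields seen in this PR).
interface SketchContext {
  requestId: string
  activity: string
  provider: string
  model: string
}

// Local stand-in for GenerationMiddleware; payload shapes are assumed.
interface SketchMiddleware {
  onStart?: (ctx: SketchContext) => void | Promise<void>
  onFinish?: (
    ctx: SketchContext,
    info: { duration: number },
  ) => void | Promise<void>
  onError?: (
    ctx: SketchContext,
    info: { error: unknown; duration: number },
  ) => void | Promise<void>
}

// Hypothetical observe-only middleware that appends one line per event.
function createLogMiddleware(sink: Array<string>): SketchMiddleware {
  return {
    onStart(ctx) {
      sink.push(`start ${ctx.activity} ${ctx.provider}/${ctx.model}`)
    },
    onFinish(ctx, info) {
      sink.push(`finish ${ctx.activity} in ${info.duration}ms`)
    },
    onError(ctx, info) {
      sink.push(`error ${ctx.activity}: ${String(info.error)}`)
    },
  }
}

// Driving the hooks by hand, the way an activity would around one call.
const lines: Array<string> = []
const mw = createLogMiddleware(lines)
const ctx: SketchContext = {
  requestId: 'image-1',
  activity: 'image',
  provider: 'openai',
  model: 'gpt-image-2',
}
mw.onStart?.(ctx)
mw.onFinish?.(ctx, { duration: 120 })
```

Keeping the middleware observe-only (it records but never mutates the context) matches how the `middleware` option is described on the media activities.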

## Related

- [Middleware](./middleware) — the lifecycle this middleware hooks into
2 changes: 1 addition & 1 deletion docs/config.json
@@ -287,7 +287,7 @@
"label": "OpenTelemetry",
"to": "advanced/otel",
"addedAt": "2026-05-08",
- "updatedAt": "2026-06-11"
+ "updatedAt": "2026-06-17"
}
]
},
1 change: 1 addition & 0 deletions packages/ai/src/activities/chat/index.ts
@@ -646,6 +646,7 @@ class TextEngine<
this.deferredPromises.push(promise)
},
// Provider / adapter info
activity: 'chat',
provider: config.adapter.name,
model: config.params.model,
source: 'server',
7 changes: 7 additions & 0 deletions packages/ai/src/activities/chat/middleware/types.ts
@@ -80,6 +80,13 @@ export interface ChatMiddlewareContext<TContext = unknown> {

// --- Provider / adapter info (immutable for the lifetime of the request) ---

/**
* Which activity this context describes — always `'chat'`. Present so the
* chat context structurally satisfies the base `GenerationMiddlewareContext`,
* letting an observe-only middleware authored against the base (e.g.
* `otelMiddleware`) run on both chat and media activities.
*/
activity: 'chat'
/** Provider name (e.g., 'openai', 'anthropic') */
provider: string
/** Model identifier (e.g., 'gpt-4o') */
43 changes: 42 additions & 1 deletion packages/ai/src/activities/generateAudio/index.ts
@@ -8,8 +8,16 @@
import { aiEventClient } from '@tanstack/ai-event-client'
import { streamGenerationResult } from '../stream-generation-result.js'
import { resolveDebugOption } from '../../logger/resolve'
import {
createGenerationContext,
runGenerationError,
runGenerationFinish,
runGenerationStart,
runGenerationUsage,
} from '../middleware'
import type { InternalLogger } from '../../logger/internal-logger'
import type { DebugOption } from '../../logger/types'
import type { GenerationMiddleware } from '../middleware'
import type { AudioAdapter } from './adapter'
import type { AudioGenerationResult, StreamChunk } from '../../types'

@@ -70,6 +78,12 @@ export interface AudioActivityOptions<
* control and/or a custom `Logger`.
*/
debug?: DebugOption
/**
* Observe-only middleware notified on start, usage, success, and error. Pass
* `otelMiddleware()` to emit OpenTelemetry spans, or implement the
* `GenerationMiddleware` contract for a custom backend.
*/
middleware?: Array<GenerationMiddleware>
}

// ===========================
@@ -135,7 +149,13 @@ async function runGenerateAudio<
>(
options: AudioActivityOptions<TAdapter, boolean>,
): Promise<AudioGenerationResult> {
-  const { adapter, stream: _stream, debug: _debug, ...rest } = options
+  const {
+    adapter,
+    stream: _stream,
+    debug: _debug,
+    middleware,
+    ...rest
+  } = options
const model = adapter.model
const requestId = createId('audio')
const startTime = Date.now()
@@ -145,6 +165,17 @@ async function runGenerateAudio<
(adapter as { name?: string }).name ??
'unknown'

const mwCtx = createGenerationContext({
requestId,
activity: 'audio',
provider: adapter.name,
model,
modelOptions: rest.modelOptions,
createId,
})

await runGenerationStart(middleware, mwCtx)

aiEventClient.emit('audio:request:started', {
requestId,
provider: adapter.name,
@@ -189,6 +220,12 @@ async function runGenerateAudio<
audioDuration: result.audio.duration,
})

if (result.usage) await runGenerationUsage(middleware, mwCtx, result.usage)
await runGenerationFinish(middleware, mwCtx, {
duration: elapsedMs,
usage: result.usage,
})

return result
} catch (error) {
const elapsedMs = Date.now() - startTime
@@ -202,6 +239,10 @@ async function runGenerateAudio<
modelOptions: rest.modelOptions as Record<string, unknown> | undefined,
timestamp: Date.now(),
})
await runGenerationError(middleware, mwCtx, {
error,
duration: elapsedMs,
})
logger.errors('generateAudio activity failed', {
error,
source: 'generateAudio',
43 changes: 42 additions & 1 deletion packages/ai/src/activities/generateImage/index.ts
@@ -8,9 +8,17 @@
import { aiEventClient } from '@tanstack/ai-event-client'
import { streamGenerationResult } from '../stream-generation-result.js'
import { resolveDebugOption } from '../../logger/resolve'
import {
createGenerationContext,
runGenerationError,
runGenerationFinish,
runGenerationStart,
runGenerationUsage,
} from '../middleware'
import { resolveMediaPrompt } from '../../utilities/media-prompt'
import type { InternalLogger } from '../../logger/internal-logger'
import type { DebugOption } from '../../logger/types'
import type { GenerationMiddleware } from '../middleware'
import type { ImageAdapter } from './adapter'
import type {
ImageGenerationResult,
@@ -123,6 +131,12 @@ export type ImageActivityOptions<
* control and/or a custom `Logger`.
*/
debug?: DebugOption
/**
* Observe-only middleware notified on start, usage, success, and error. Pass
* `otelMiddleware()` to emit OpenTelemetry spans, or implement the
* `GenerationMiddleware` contract for a custom backend.
*/
middleware?: Array<GenerationMiddleware>
} & ({} extends ImageProviderOptionsForModel<TAdapter, TAdapter['model']>
? {
/** Provider-specific options for image generation */ modelOptions?: ImageProviderOptionsForModel<
@@ -228,12 +242,29 @@ async function runGenerateImage<
>(
options: ImageActivityOptions<TAdapter, boolean>,
): Promise<ImageGenerationResult> {
-  const { adapter, stream: _stream, debug: _debug, ...rest } = options
+  const {
+    adapter,
+    stream: _stream,
+    debug: _debug,
+    middleware,
+    ...rest
+  } = options
const model = adapter.model
const requestId = createId('image')
const startTime = Date.now()
const logger: InternalLogger = resolveDebugOption(options.debug)

const mwCtx = createGenerationContext({
requestId,
activity: 'image',
provider: adapter.name,
model,
modelOptions: rest.modelOptions,
createId,
})

await runGenerationStart(middleware, mwCtx)

// Devtools events carry the flattened prompt text plus media-part counts —
// the wire payload stays `prompt: string` regardless of the prompt shape.
const resolved = resolveMediaPrompt(rest.prompt)
@@ -299,8 +330,18 @@ async function runGenerateImage<
count: result.images.length,
})

if (result.usage) await runGenerationUsage(middleware, mwCtx, result.usage)
await runGenerationFinish(middleware, mwCtx, {
duration,
usage: result.usage,
})

return result
} catch (error) {
await runGenerationError(middleware, mwCtx, {
error,
duration: Date.now() - startTime,
})
logger.errors('generateImage activity failed', {
error,
source: 'generateImage',
43 changes: 42 additions & 1 deletion packages/ai/src/activities/generateSpeech/index.ts
@@ -8,8 +8,16 @@
import { aiEventClient } from '@tanstack/ai-event-client'
import { streamGenerationResult } from '../stream-generation-result.js'
import { resolveDebugOption } from '../../logger/resolve'
import {
createGenerationContext,
runGenerationError,
runGenerationFinish,
runGenerationStart,
runGenerationUsage,
} from '../middleware'
import type { InternalLogger } from '../../logger/internal-logger'
import type { DebugOption } from '../../logger/types'
import type { GenerationMiddleware } from '../middleware'
import type { TTSAdapter } from './adapter'
import type { StreamChunk, TTSResult } from '../../types'

@@ -73,6 +81,12 @@ export interface TTSActivityOptions<
* control and/or a custom `Logger`.
*/
debug?: DebugOption
/**
* Observe-only middleware notified on start, usage, success, and error. Pass
* `otelMiddleware()` to emit OpenTelemetry spans, or implement the
* `GenerationMiddleware` contract for a custom backend.
*/
middleware?: Array<GenerationMiddleware>
}

// ===========================
@@ -143,7 +157,13 @@ export function generateSpeech<
async function runGenerateSpeech<
TAdapter extends TTSAdapter<string, TTSProviderOptions<TAdapter>>,
>(options: TTSActivityOptions<TAdapter, boolean>): Promise<TTSResult> {
-  const { adapter, stream: _stream, debug: _debug, ...rest } = options
+  const {
+    adapter,
+    stream: _stream,
+    debug: _debug,
+    middleware,
+    ...rest
+  } = options
const model = adapter.model
const requestId = createId('speech')
const startTime = Date.now()
@@ -153,6 +173,17 @@ async function runGenerateSpeech<
(adapter as { name?: string }).name ??
'unknown'

const mwCtx = createGenerationContext({
requestId,
activity: 'tts',
provider: adapter.name,
model,
modelOptions: rest.modelOptions,
createId,
})

await runGenerationStart(middleware, mwCtx)

aiEventClient.emit('speech:request:started', {
requestId,
provider: adapter.name,
@@ -202,6 +233,12 @@ async function runGenerateSpeech<
contentType: result.contentType,
})

if (result.usage) await runGenerationUsage(middleware, mwCtx, result.usage)
await runGenerationFinish(middleware, mwCtx, {
duration,
usage: result.usage,
})

return result
} catch (error) {
const duration = Date.now() - startTime
@@ -215,6 +252,10 @@ async function runGenerateSpeech<
modelOptions: rest.modelOptions as Record<string, unknown> | undefined,
timestamp: Date.now(),
})
await runGenerationError(middleware, mwCtx, {
error,
duration,
})
logger.errors('generateSpeech activity failed', {
error,
source: 'generateSpeech',