feat: support structured outputs in OpenAIChatGenerator
#9754
Conversation
Pull Request Test Coverage Report for Build 17578922709

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch, which means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
I left some comments.
I also suggest merging main to see the actual coverage. My impression is that several new code paths are not covered by unit tests. I would like to have them covered, since this component is crucial.
```python
if response_format:
    if is_streaming:
        raise ValueError(
            "OpenAI does not support `streaming_callback` with `response_format`, please choose one."
        )
```
I would like to understand this better. It seems that OpenAI does support streaming + structured outputs.
If we are making this choice for simplicity reasons, I would be more specific: "The OpenAIChatGenerator does not ..."
Hmm, interesting. Because of the beta code example in the documentation, I misunderstood it as an unstable feature (not available in the Completions API). But you are right, it's supported. I'll update the function to enable this.
- I did not notice that it was beta.
- It might also be reasonable to skip it if it requires rewriting the component significantly. (Unfortunately, many other components depend on this implementation.)
- If there is a stable way to do this, let's do it. Otherwise, let's create an issue to track this once that API is no longer in beta.
- I looked into their stream function, and it looks like the stable version also supports `response_format`.
- I will test this locally and update the code.
@anakin87 Looked into this, and here are some points:
- `response_format` with streaming will be passed to `chat.completions.create`. This endpoint allows `response_format` to be either a JSON schema or `{ "type": "json_object" }`, but it cannot be a `pydantic.BaseModel`.
- From the documentation, it seems the beta version supports Pydantic models with the `stream` endpoint, which I don't want to introduce for the reasons you mentioned above.

So for now, I believe we can support the first point and mention the limitation in the docstrings (see the sketch below). In any case, the error is handled by OpenAI itself if the user passes a Pydantic model with streaming.
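For reference, a minimal sketch of the first point: passing a JSON-schema `response_format` to `chat.completions.create` with streaming enabled. The model and schema names here are illustrative, not from this PR.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A strict JSON schema: the form the create endpoint accepts alongside stream=True
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",  # illustrative schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Give me facts about Berlin."}],
    response_format=response_format,  # a dict, not a pydantic.BaseModel
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```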
Agree... Let's do what you propose and mention in the docstrings that a Pydantic `response_format` won't work with streaming.
I like the progress (unit tests are still missing).
(Since I'll be off, once the PR is ready, feel free to dismiss my review as stale and let David approve.)
if "stream" in api_args.keys(): | ||
chat_completion = self.client.chat.completions.create(**api_args) |
I find this hard to understand. Would something similar to this work? We could always return a dictionary with the same fields from `_prepare_api_call`.

```python
if api_args.get("response_format"):
    # We cannot pass the stream param to the chat.completions.parse endpoint
    api_args.pop("stream", None)
    chat_completion = self.client.chat.completions.parse(**api_args)
else:
    ...
```

(I might be wrong; in any case, I'd appreciate it if we can make the code more intuitive to follow.)
Hmm, we allow passing `response_format` with the `stream` param to the `create` endpoint for streaming structured outputs, so this won't work.
Ok, I now understand, but it's hard to follow. What I would recommend (sketched right after this list) is to:
- include in `api_args` an item called `endpoint`/`method` containing "parse" or "create" (in `_prepare_api_call`), and add comments where this value is set to explain why we are doing that
- reuse the value in `run`
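A sketch of that shape, with hypothetical helper names standing in for the component's internals (not the PR's actual code):

```python
from typing import Any

def prepare_api_call(
    messages: list[dict], response_format: Any = None, streaming: bool = False
) -> dict[str, Any]:
    """Hypothetical mirror of _prepare_api_call: always return the same
    fields, including which endpoint the caller should use."""
    api_args: dict[str, Any] = {"model": "gpt-4o-mini", "messages": messages}
    if response_format is not None and not streaming:
        # parse() accepts Pydantic models and strict JSON schemas,
        # but does not take a stream param
        api_args["endpoint"] = "parse"
        api_args["response_format"] = response_format
    else:
        # create() handles plain calls and streaming; when streaming,
        # response_format must be a JSON schema dict, not a Pydantic model
        api_args["endpoint"] = "create"
        if streaming:
            api_args["stream"] = True
        if response_format is not None:
            api_args["response_format"] = response_format
    return api_args

def run_api_call(client, api_args: dict[str, Any]):
    # Reuse the endpoint value chosen above instead of re-deriving it here
    endpoint = api_args.pop("endpoint")
    if endpoint == "parse":
        return client.chat.completions.parse(**api_args)
    return client.chat.completions.create(**api_args)
```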
```python
if openai_endpoint == "create":
    chat_completion = await self.async_client.chat.completions.create(**api_args)
elif openai_endpoint == "parse":
    chat_completion = await self.async_client.chat.completions.parse(**api_args)
```
I think we should add an extra else here that raises an exception in case an unexpected value is popped. It would avoid all the typing issues below. Concretely, something like the sketch below.
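A sketch over the diff above (not the final code):

```python
if openai_endpoint == "create":
    chat_completion = await self.async_client.chat.completions.create(**api_args)
elif openai_endpoint == "parse":
    chat_completion = await self.async_client.chat.completions.parse(**api_args)
else:
    # Defensive guard: fail loudly on an unexpected endpoint value instead
    # of falling through with chat_completion left unbound
    raise ValueError(f"Unexpected OpenAI endpoint: {openai_endpoint!r}")
```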
I updated the condition to use `parse` if it's passed as the endpoint, and otherwise always use `create`, which was the case before.
```diff
@@ -5,7 +5,9 @@
 import os
 from typing import Any, Optional, Union

+from openai.lib._pydantic import to_strict_json_schema
```
Is there a different way to import this function that doesn't go through a private file? I'm a little worried the import path is liable to break or change.
Hmm, for now it seems like this is the only way to import it.
Hmm, I think we can do this a different way. We should be able to directly use the one from Pydantic, which looks like this: `parameters_schema = model.model_json_schema()`. This is from the `_create_tool_parameters_schema` function in `ComponentTool`.
OpenAI expects a stricter JSON Schema than Pydantic's default. For example, objects must set `additionalProperties`, and `Optional` keys are handled differently. As a result, `model_json_schema()` often isn't accepted as-is; the sketch below illustrates the difference.
It's also discussed here, where another solution is offered, but I prefer using the OpenAI method over some unpopular library.
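A minimal sketch of the difference, using an illustrative model (not from this PR):

```python
from typing import Optional
from pydantic import BaseModel
from openai.lib._pydantic import to_strict_json_schema

class CityInfo(BaseModel):
    city: str
    population: Optional[int] = None

# Pydantic's default: "population" is not required and "additionalProperties"
# is not set, which OpenAI's structured outputs reject in strict mode
print(CityInfo.model_json_schema())

# OpenAI's strict variant: every key is required (Optional fields become
# nullable instead) and "additionalProperties" is set to False
print(to_strict_json_schema(CityInfo))
```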
Nevertheless, I spotted a bug in `to_dict` where the schema wasn't stored properly. Fixing this.
Ahh okay, thanks for the info.
Related Issues

Proposed Changes:
- Support `response_format` in `OpenAIChatGenerator` and `AzureOpenAIChatGenerator`.

How did you test it?
- Tested `response_format` using a Pydantic model and a JSON schema; a usage sketch is shown below.

Notes for the reviewer
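For illustration, a usage sketch under the assumption that `response_format` is forwarded through `generation_kwargs` like other OpenAI parameters; check the merged docstring for the final surface.

```python
from pydantic import BaseModel
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

class CityInfo(BaseModel):
    city: str
    population: int

# Assumption: response_format rides along in generation_kwargs; with
# streaming enabled it would have to be a JSON schema dict instead
generator = OpenAIChatGenerator(
    model="gpt-4o-mini",
    generation_kwargs={"response_format": CityInfo},
)
result = generator.run(messages=[ChatMessage.from_user("Tell me about Berlin.")])
print(result["replies"][0].text)  # JSON string matching the CityInfo schema
```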
Checklist
- The PR title uses one of the conventional commit prefixes (`fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`) and adds `!` in case the PR includes breaking changes.