Pydantic Field metadata causes invalid JSON schema in OpenAI Structured Outputs #2024

Closed
@KanchiShimono

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

When using Pydantic’s Field to include metadata such as title or description in nested models, the generated JSON schema does not properly set additionalProperties: false for $ref-referenced types when they are inlined. This causes a BadRequestError (400) from the API, with the following error message:

BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'Universe': In context=('properties', 'largest_star'), 'additionalProperties' is required to be supplied and to be false.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}

The root cause is that when a nested object is defined under $defs and its $ref is subsequently inlined, the inlined copy omits additionalProperties: false, which violates the strict schema requirements.
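One way an omission like this can arise is an ordering problem: a $ref that carries sibling keys (such as description) is replaced by a copy of its definition before that definition has been marked strict, and the copy is never revisited. The sketch below is a toy model of that idea, not the library's actual code; `resolve`, `toy_strict`, `sample_schema` and the `recheck_inlined` flag are hypothetical names introduced for illustration.

```python
import copy


def resolve(root, ref):
    """Resolve a local reference such as '#/$defs/Star' against the root schema."""
    node = root
    for part in ref.removeprefix("#/").split("/"):
        node = node[part]
    return node


def toy_strict(schema, root, recheck_inlined=False):
    """Hypothetical strict pass: mark objects, recurse, then inline refs with siblings."""
    for sub in schema.get("$defs", {}).values():
        toy_strict(sub, root, recheck_inlined)
    if schema.get("type") == "object" and "additionalProperties" not in schema:
        schema["additionalProperties"] = False
    for sub in schema.get("properties", {}).values():
        toy_strict(sub, root, recheck_inlined)
    ref = schema.get("$ref")
    if ref and len(schema) > 1:
        # Sibling keys (e.g. description) override keys from the referenced definition.
        merged = {**copy.deepcopy(resolve(root, ref)), **schema}
        merged.pop("$ref")
        schema.clear()
        schema.update(merged)
        if recheck_inlined:
            # Re-running the pass on the inlined copy closes the gap.
            toy_strict(schema, root, recheck_inlined)
    return schema


def sample_schema():
    """Abbreviated, pre-strict version of the Universe schema from this report."""
    return {
        "$defs": {
            "Galaxy": {
                "type": "object",
                "properties": {
                    "largest_star": {
                        "$ref": "#/$defs/Star",
                        "description": "The largest star in the galaxy.",
                    }
                },
            },
            "Star": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
        "type": "object",
        "properties": {
            "galaxy": {"$ref": "#/$defs/Galaxy", "description": "A galaxy in the universe."}
        },
    }


root = sample_schema()
toy_strict(root, root)
star = root["properties"]["galaxy"]["properties"]["largest_star"]
print("additionalProperties" in star)  # False in this toy model
```

In this model, Galaxy is visited before Star, so Star is inlined into Galaxy.largest_star while it still lacks the flag. Passing `recheck_inlined=True` re-runs the pass on each inlined copy and sets the flag there as well.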

The JSON schema generated from the Universe class in the code snippet below is shown here. additionalProperties: false is missing from both inlined copies of Galaxy.largest_star: the one under $defs.Galaxy and the one under properties.galaxy.

{
  "$defs": {
    "Galaxy": {
      "properties": {
        "name": {
          "description": "The name of the galaxy.",
          "title": "Name",
          "type": "string"
        },
        "largest_star": {
          # `"additionalProperties": false` is missing.
          "description": "The largest star in the galaxy.",
          "properties": {
            "name": {
              "description": "The name of the star.",
              "title": "Name",
              "type": "string"
            }
          },
          "required": ["name"],
          "title": "Star",
          "type": "object"
        }
      },
      "required": ["name", "largest_star"],
      "title": "Galaxy",
      "type": "object",
      "additionalProperties": false
    },
    "Star": {
      "properties": {
        "name": {
          "description": "The name of the star.",
          "title": "Name",
          "type": "string"
        }
      },
      "required": ["name"],
      "title": "Star",
      "type": "object",
      "additionalProperties": false
    }
  },
  "properties": {
    "name": {
      "description": "The name of the universe.",
      "title": "Name",
      "type": "string"
    },
    "galaxy": {
      "description": "A galaxy in the universe.",
      "properties": {
        "name": {
          "description": "The name of the galaxy.",
          "title": "Name",
          "type": "string"
        },
        "largest_star": {
          "description": "The largest star in the galaxy.",
          "properties": {
            "name": {
              "description": "The name of the star.",
              "title": "Name",
              "type": "string"
            }
          },
          "required": ["name"],
          "title": "Star",
          "type": "object"
        }
      },
      "required": ["name", "largest_star"],
      "title": "Galaxy",
      "type": "object",
      "additionalProperties": false
    }
  },
  "required": ["name", "galaxy"],
  "title": "Universe",
  "type": "object",
  "additionalProperties": false
}
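To locate such gaps without calling the API, a schema dict like the one above can be walked locally. The following is a minimal sketch; `find_open_objects` is a hypothetical helper, not part of the openai library, and it only checks the one rule quoted in the error message (every object schema must carry additionalProperties: false).

```python
def find_open_objects(schema, path=()):
    """Return the paths of object schemas that lack additionalProperties: false."""
    found = []
    if isinstance(schema, dict):
        if schema.get("type") == "object" and schema.get("additionalProperties") is not False:
            found.append(path)
        for key, value in schema.items():
            found.extend(find_open_objects(value, path + (key,)))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            found.extend(find_open_objects(item, path + (index,)))
    return found


# Abbreviated copy of the schema above, keeping only the keys the check needs.
star_inlined = {"type": "object", "properties": {"name": {"type": "string"}}}
galaxy = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"name": {"type": "string"}, "largest_star": star_inlined},
}
schema = {
    "$defs": {"Galaxy": galaxy, "Star": {**star_inlined, "additionalProperties": False}},
    "type": "object",
    "additionalProperties": False,
    "properties": {"name": {"type": "string"}, "galaxy": galaxy},
}

for open_path in find_open_objects(schema):
    print("/".join(map(str, open_path)))
```

For the abbreviated schema, the helper reports the two inlined largest_star objects, one under $defs/Galaxy and one under properties/galaxy.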

This issue occurs when using the AzureOpenAI client and is likely reproducible with the standard OpenAI client as well, as they both share the underlying schema handling mechanism.

To Reproduce

  1. Initialize an AzureOpenAI client (or an OpenAI client).
  2. Create a nested Pydantic model structure with Field metadata (title and description).
  3. Pass the top-level model as the response_format argument.
  4. Observe the 400 error indicating the missing additionalProperties: false for nested objects.

Code snippets

from typing import Annotated

from openai import AzureOpenAI
from pydantic import BaseModel, Field


class Star(BaseModel):
    name: Annotated[str, Field(description="The name of the star.")]


class Galaxy(BaseModel):
    name: Annotated[str, Field(description="The name of the galaxy.")]
    largest_star: Annotated[Star, Field(description="The largest star in the galaxy.")]


class Universe(BaseModel):
    name: Annotated[str, Field(description="The name of the universe.")]
    galaxy: Annotated[Galaxy, Field(description="A galaxy in the universe.")]


client = AzureOpenAI(azure_endpoint="endpoint", api_key="api-key", api_version="api-version")

prompt = "Create a fictional universe for a science fiction novel."
completion = client.beta.chat.completions.parse(
    messages=[
        {"role": "user", "content": prompt},
    ],
    model="gpt-4o",
    response_format=Universe,
)
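Until the schema generation is fixed, one possible workaround is to patch the schema dict by hand and send it as an explicit json_schema response format instead of passing the Pydantic class. This is a sketch under assumptions: `close_objects` and `as_response_format` are hypothetical helpers, and the approach has not been verified against the API in this report.

```python
def close_objects(schema):
    """Recursively set additionalProperties: false on every object schema, in place."""
    if isinstance(schema, dict):
        if schema.get("type") == "object":
            schema["additionalProperties"] = False
        for value in schema.values():
            close_objects(value)
    elif isinstance(schema, list):
        for item in schema:
            close_objects(item)
    return schema


def as_response_format(name, schema):
    """Wrap a patched schema in the json_schema response_format shape."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": close_objects(schema)},
    }


# Minimal stand-in for the failing part of the Universe schema.
example = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "largest_star": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        }
    },
    "required": ["largest_star"],
}
response_format = as_response_format("Universe", example)
print(response_format["json_schema"]["schema"]["properties"]["largest_star"])
```

The resulting dict would be passed as response_format to client.chat.completions.create, and the returned message content could then be validated with Universe.model_validate_json. The cost of this route is that the response is no longer parsed into the model automatically.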

OS

macOS

Python version

Python v3.12.8

Library version

openai v1.59.7


Labels

bug: Something isn't working
