fix(structured outputs): resolve memory leak in parse methods#2860

Open
karpetrosyan wants to merge 7 commits into openai:next from karpetrosyan:fix-parse-memory-leak

@karpetrosyan (Collaborator) commented Feb 11, 2026

This PR stops parametrizing Pydantic generics at runtime, which can leak memory, and instead uses a simpler implementation for parsing the response message.
It also fixes generic-related conflicts in the tests, which arise because Pydantic v1 and Pydantic v2 implement __repr__ differently.
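
For illustration only, here is a toy, stdlib-only sketch of how specializing a generic class at runtime with a freshly created type on every call can retain memory. It is an analogy under stated assumptions, not Pydantic's actual caching code: `ToyGeneric`, `make_model`, and `run` are hypothetical names, and the plain `dict` cache is an assumption chosen to make the effect easy to see.

```python
import gc
from typing import Dict


class ToyGeneric:
    """Toy stand-in for a generic model (hypothetical, not Pydantic's implementation)."""

    # One specialized subclass is kept per type parameter, forever.
    _specializations: Dict[type, type] = {}

    def __class_getitem__(cls, param: type) -> type:
        # Build and cache a subclass the first time a parameter is seen.
        if param not in cls._specializations:
            name = f"{cls.__name__}[{param.__name__}]"
            cls._specializations[param] = type(name, (cls,), {"param": param})
        return cls._specializations[param]


def make_model() -> type:
    # Returns a brand-new class on every call, like create_model() in a loop.
    return type("MathResponse", (), {})


def run(iterations: int) -> int:
    """Specialize ToyGeneric with a new model each iteration; return the cache size."""
    for _ in range(iterations):
        ToyGeneric[make_model()]
    gc.collect()  # Collection does not help: the cache still references every class.
    return len(ToyGeneric._specializations)


if __name__ == "__main__":
    # The cache gains one entry per iteration because every model is a new key.
    print(run(100))
```

In this toy, skipping the specialization step entirely (working with the plain class and the model side by side) leaves nothing to accumulate, which is the general shape of the approach described above.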

Leak results on the main branch
iter=0  RSS=68.2 MB
iter=20  RSS=71.5 MB
iter=40  RSS=74.6 MB
iter=60  RSS=77.8 MB
iter=80  RSS=81.0 MB
iter=100  RSS=84.2 MB
iter=120  RSS=87.4 MB
iter=140  RSS=89.2 MB
iter=160  RSS=90.0 MB
iter=180  RSS=88.2 MB
iter=200  RSS=88.8 MB
iter=220  RSS=89.3 MB
iter=240  RSS=89.6 MB
iter=260  RSS=90.1 MB
iter=280  RSS=90.5 MB
iter=300  RSS=91.0 MB
iter=320  RSS=91.5 MB
iter=340  RSS=92.2 MB
iter=360  RSS=92.5 MB
iter=380  RSS=92.9 MB
iter=400  RSS=93.3 MB
iter=420  RSS=93.8 MB
iter=440  RSS=94.1 MB
iter=460  RSS=94.4 MB
iter=480  RSS=94.7 MB
iter=500  RSS=95.2 MB
iter=520  RSS=95.8 MB
iter=540  RSS=96.3 MB
iter=560  RSS=96.8 MB
iter=580  RSS=97.5 MB
iter=600  RSS=97.8 MB
iter=620  RSS=100.5 MB
iter=640  RSS=100.8 MB
iter=660  RSS=101.1 MB
iter=680  RSS=101.4 MB
iter=700  RSS=101.7 MB
iter=720  RSS=102.2 MB
iter=740  RSS=102.6 MB
iter=760  RSS=102.9 MB
iter=780  RSS=103.3 MB
iter=800  RSS=103.9 MB
iter=820  RSS=104.3 MB
iter=840  RSS=104.7 MB
iter=860  RSS=105.1 MB
iter=880  RSS=105.4 MB
iter=900  RSS=105.8 MB
iter=920  RSS=106.1 MB
iter=940  RSS=104.9 MB
iter=960  RSS=105.8 MB
iter=980  RSS=106.0 MB
Leak results in #2860
iter=0  RSS=71.9 MB
iter=20  RSS=72.0 MB
iter=40  RSS=72.1 MB
iter=60  RSS=72.1 MB
iter=80  RSS=72.1 MB
iter=100  RSS=72.1 MB
iter=120  RSS=72.1 MB
iter=140  RSS=72.1 MB
iter=160  RSS=72.1 MB
iter=180  RSS=72.1 MB
iter=200  RSS=72.1 MB
iter=220  RSS=72.1 MB
iter=240  RSS=72.1 MB
iter=260  RSS=71.5 MB
iter=280  RSS=70.9 MB
iter=300  RSS=70.9 MB
iter=320  RSS=70.9 MB
iter=340  RSS=70.9 MB
iter=360  RSS=70.9 MB
iter=380  RSS=70.9 MB
iter=400  RSS=70.9 MB
iter=420  RSS=70.9 MB
iter=440  RSS=70.9 MB
iter=460  RSS=70.9 MB
iter=480  RSS=70.9 MB
iter=500  RSS=70.9 MB
iter=520  RSS=70.9 MB
iter=540  RSS=70.9 MB
iter=560  RSS=70.9 MB
iter=580  RSS=70.9 MB
iter=600  RSS=70.9 MB
iter=620  RSS=68.4 MB
iter=640  RSS=68.2 MB
iter=660  RSS=68.2 MB
iter=680  RSS=68.2 MB
iter=700  RSS=68.2 MB
iter=720  RSS=68.2 MB
iter=740  RSS=68.2 MB
iter=760  RSS=68.2 MB
iter=780  RSS=68.0 MB
iter=800  RSS=68.0 MB
iter=820  RSS=68.0 MB
iter=840  RSS=68.0 MB
iter=860  RSS=68.0 MB
iter=880  RSS=68.0 MB
iter=900  RSS=68.0 MB
iter=920  RSS=68.0 MB
iter=940  RSS=68.0 MB
iter=960  RSS=68.0 MB
iter=980  RSS=68.0 MB
Leak results in #2148
iter=0  RSS=67.8 MB
iter=20  RSS=71.2 MB
iter=40  RSS=74.4 MB
iter=60  RSS=77.7 MB
iter=80  RSS=81.0 MB
iter=100  RSS=84.2 MB
iter=120  RSS=87.4 MB
iter=140  RSS=89.3 MB
iter=160  RSS=90.1 MB
iter=180  RSS=90.8 MB
iter=200  RSS=91.2 MB
iter=220  RSS=91.5 MB
iter=240  RSS=92.1 MB
iter=260  RSS=92.9 MB
iter=280  RSS=93.4 MB
iter=300  RSS=93.8 MB
iter=320  RSS=90.5 MB
iter=340  RSS=90.9 MB
iter=360  RSS=91.6 MB
iter=380  RSS=92.0 MB
iter=400  RSS=92.7 MB
iter=420  RSS=93.2 MB
iter=440  RSS=93.7 MB
iter=460  RSS=94.1 MB
iter=480  RSS=94.6 MB
iter=500  RSS=94.9 MB
iter=520  RSS=95.3 MB
iter=540  RSS=95.6 MB
iter=560  RSS=96.0 MB
iter=580  RSS=96.5 MB
iter=600  RSS=97.0 MB
iter=620  RSS=99.9 MB
iter=640  RSS=100.3 MB
iter=660  RSS=100.7 MB
iter=680  RSS=101.6 MB
iter=700  RSS=102.2 MB
iter=720  RSS=102.6 MB
iter=740  RSS=102.9 MB
iter=760  RSS=103.4 MB
iter=780  RSS=103.9 MB
iter=800  RSS=104.8 MB
iter=820  RSS=105.4 MB
iter=840  RSS=105.7 MB
iter=860  RSS=106.0 MB
iter=880  RSS=106.4 MB
iter=900  RSS=106.8 MB
iter=920  RSS=107.3 MB
iter=940  RSS=108.0 MB
iter=960  RSS=108.2 MB
iter=980  RSS=108.6 MB
Running tests locally

To run the test locally, use the following PEP 723 compatible script.
Set the appropriate branch for the openai-python dependency in the script.

For the main branch, use: openai @ git+https://github.com/openai/openai-python.git@main

# /// script
# requires-python = ">=3.8"
# dependencies = [
#   "hishel[httpx]",
#   "openai @ git+https://github.com/openai/openai-python.git@refs/pull/2860/head",
#   "psutil",
#   "pydantic>=2.0.0",
# ]
# ///

import gc
import os
import asyncio
from typing import List

import hishel
import psutil
from pydantic import Field, create_model
from hishel.httpx import AsyncCacheClient

from openai import AsyncOpenAI

proc = psutil.Process(os.getpid())

StepModel = create_model(
    "Step",
    explanation=(str, Field()),
    output=(str, Field()),
)


def create_new_model():
    return create_model(
        "MathResponse",
        steps=(List[StepModel], Field()),
        final_answer=(str, Field()),
    )


async def main():
    client = AsyncOpenAI(http_client=AsyncCacheClient(policy=hishel.FilterPolicy()))
    for i in range(1000):
        model = create_new_model()
        await client.beta.chat.completions.parse(
            model="gpt-4o-2024-08-06",
            messages=[
                {"role": "system", "content": "You are a helpful math tutor."},
                {"role": "user", "content": "solve 8x + 31 = 2"},
            ],
            response_format=model,
        )
        gc.collect()
        if i % 20 == 0:
            print(f"iter={i}  RSS={proc.memory_info().rss / 1024**2:.1f} MB")


asyncio.run(main())

@karpetrosyan karpetrosyan marked this pull request as ready for review February 11, 2026 21:31
@karpetrosyan karpetrosyan requested a review from a team as a code owner February 11, 2026 21:31
@karpetrosyan (Collaborator, Author)

Memory leak tests are a bit flaky, tbh, which is why I didn't add one.
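
As a side note, and not something this PR includes: one way such a check is sometimes made less noisy is to measure retained Python allocations with the stdlib `tracemalloc` module instead of process RSS, which also depends on the allocator and the OS. The sketch below is an assumption-laden illustration; `traced_growth`, `leaky`, and `clean` are hypothetical names, and any threshold chosen on top of it would still need tuning.

```python
import gc
import tracemalloc
from typing import Callable, List


def traced_growth(workload: Callable[[], None], *, warmup: int = 20, iterations: int = 200) -> int:
    """Return the bytes of Python allocations still alive after `iterations` calls."""
    # Warm up first so one-time caches and lazy imports are not counted as a leak.
    for _ in range(warmup):
        workload()
    gc.collect()

    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            workload()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


_sink: List[bytearray] = []


def leaky() -> None:
    # Retains 10 kB per call by appending to a module-level list.
    _sink.append(bytearray(10_000))


def clean() -> None:
    # Allocates the same amount but drops it immediately.
    bytearray(10_000)


if __name__ == "__main__":
    print(f"leaky: {traced_growth(leaky)} bytes retained")
    print(f"clean: {traced_growth(clean)} bytes retained")
```

Note that `tracemalloc` only sees allocations made through Python's allocators, so memory held by native extensions would not show up here.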

@RobertCraigie (Collaborator) left a comment

How does pydantic know what class to instantiate now that we aren't actually passing it?

def clear_locals(string: str, *, stacklevel: int) -> str:
    caller = get_caller_name(stacklevel=stacklevel + 1)
    return string.replace(f"{caller}.<locals>.", "")
    return re.sub(r"([A-Za-z_]\w*)\[[^\[\]]+\](?=\()", r"\1", string)

Collaborator

q: what does this do? can you add a comment

Collaborator Author

ahh right, I'll add a comment. It's stripping out the generic name
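
To make the "stripping out the generic name" explanation concrete, here is a small standalone sketch that applies the same regular expression from the snippet above to a sample repr string. The helper name `strip_generic_params` and the sample string (including the `Location` model name) are made up for illustration and are not taken from the PR.

```python
import re

# Same pattern as in the reviewed line: an identifier, then a single
# non-nested [...] group, but only when it is directly followed by "(".
GENERIC_SUFFIX = re.compile(r"([A-Za-z_]\w*)\[[^\[\]]+\](?=\()")


def strip_generic_params(text: str) -> str:
    """Drop the [Param] part from names like Name[Param](...) in a repr string."""
    return GENERIC_SUFFIX.sub(r"\1", text)


if __name__ == "__main__":
    sample = "ParsedChatCompletion[Location](id='abc', choices=[ParsedChoice[Location](index=0)])"
    print(strip_generic_params(sample))
    # prints: ParsedChatCompletion(id='abc', choices=[ParsedChoice(index=0)])
```

Because the bracket group excludes nested brackets, a doubly parametrized name such as `A[B[C]](...)` is left unchanged by this pattern.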

Collaborator Author

Added a comment. I've also fixed the same leak for streaming.

@karpetrosyan (Collaborator, Author) commented Feb 12, 2026

> How does pydantic know what class to instantiate now that we aren't actually passing it?

We’re already instantiating the appropriate class here

@RobertCraigie (Collaborator)

ahhh
