function_call_output, and the parent continues generation.
A child response is sometimes called a subagent.
In this guide, it is just another Responses request that performs a focused task.
Parent response
-> function_call spawn_subagent
-> your app runs one or more child responses
-> child results returned as function_call_output
-> parent response resumes generation
How the loop works
In this pattern, the parent response pauses whenever it calls spawn_subagent.
Your app executes the delegated tasks and then resumes the parent with the results.
- Create a parent response with a spawn_subagent function tool.
- When the model calls the tool, parse each tool call's arguments.
- Run one or more child Responses requests to perform the delegated tasks.
- Wait for all child responses to finish.
- Return each child result as function_call_output using the matching call_id.
- Resume the parent response with previous_response_id.
- Repeat until the parent produces a normal assistant message.
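The steps above can be sketched as a small driver loop. This is a minimal sketch, not the guide's recipe: the item shapes are simplified, and `runBatch` and `resume` are hypothetical callbacks your app would supply (for example, thin wrappers around the SDK's create call). The `maxTurns` guard is also an assumption added for safety.

```typescript
// Simplified item shapes for this sketch; real SDK types may carry more fields.
type FunctionCall = { type: "function_call"; call_id: string; name: string; arguments: string };
type OutputItem = FunctionCall | { type: "message" };
type ParentResponse = { id: string; output: OutputItem[] };
type ToolOutput = { type: "function_call_output"; call_id: string; output: string };

// Hypothetical callbacks supplied by your app.
type RunBatch = (calls: FunctionCall[]) => Promise<ToolOutput[]>;
type ResumeParent = (previousResponseId: string, outputs: ToolOutput[]) => Promise<ParentResponse>;

// Collect the spawn_subagent calls emitted in one parent turn.
function pendingCalls(response: ParentResponse): FunctionCall[] {
  return response.output.filter(
    (item): item is FunctionCall =>
      item.type === "function_call" && item.name === "spawn_subagent",
  );
}

async function runLoop(
  first: ParentResponse,
  runBatch: RunBatch,
  resume: ResumeParent,
  maxTurns = 8, // arbitrary guard so the loop cannot run forever
): Promise<ParentResponse> {
  let parent = first;
  for (let turn = 0; turn < maxTurns; turn++) {
    const calls = pendingCalls(parent);
    // No delegation requested: the parent produced its final message.
    if (calls.length === 0) return parent;
    // Run the delegated tasks, then resume the parent with every result.
    const outputs = await runBatch(calls);
    parent = await resume(parent.id, outputs);
  }
  throw new Error(`parent did not finish within ${maxTurns} turns`);
}
```

Passing the callbacks in keeps the loop independent of any particular SDK and makes it easy to exercise with fakes.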
Define a tool for delegation
Keep the tool focused. Pass only the fields the child response needs.
Use tool_choice: "auto" when the model should decide when to delegate.
Use tool_choice: "required" when every turn must go through a tool.
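A delegation tool along these lines might look like the sketch below. It assumes a Responses-style function tool shape (`type`, `name`, `description`, `parameters` as JSON Schema). The `task` and `context` fields are hypothetical, chosen only to illustrate passing the minimum a child needs.

```typescript
// Assumed tool shape; check your SDK's types for the exact fields it accepts.
type FunctionTool = {
  type: "function";
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
};

const spawnSubagentTool: FunctionTool = {
  type: "function",
  name: "spawn_subagent",
  description: "Delegate one focused task to a child response and return its result.",
  parameters: {
    type: "object",
    properties: {
      // Hypothetical fields: only what the child needs to do its job.
      task: { type: "string", description: "The focused task for the child." },
      context: { type: "string", description: "Minimal background the child needs." },
    },
    required: ["task", "context"],
    additionalProperties: false,
  },
};

// Request options for the parent; "auto" lets the model decide when to delegate.
const parentToolOptions = {
  tools: [spawnSubagentTool],
  tool_choice: "auto",
};
```

Rejecting extra properties in the schema is one way to keep the model from passing fields the child never reads.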
TypeScript SDK recipe
This example uses sdk.llm.responses.create for both the parent and child responses.
It allows the parent to emit multiple spawn_subagent calls in one turn.
Your app runs those child responses concurrently, waits for all of them to finish, and then resumes the parent once with every tool result.
Keep each child result small so the parent can use it in the next turn without consuming too much context.
Fan out and wait for all child responses
When parallel_tool_calls is true, the parent can emit several function_call items in one turn.
Treat that set of tool calls as a batch.
Start every child response, wait for all of them to finish, and only then resume the parent.
This creates a barrier:
- Parent response emits many function_call items.
- Your app starts many child responses.
- Your app waits for all child responses to complete.
- Your app sends all function_call_output items back in one follow-up request.
- Parent response continues with the full set of delegated results.
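The barrier described above maps naturally onto `Promise.all`. The following is a minimal sketch under stated assumptions: `runChild` is a hypothetical function that sends one child request and resolves with its (already compact) text result, and the item shapes are simplified.

```typescript
type FunctionCall = { call_id: string; name: string; arguments: string };
type ToolOutput = { type: "function_call_output"; call_id: string; output: string };

// Hypothetical: performs one child request and resolves with its result text.
type RunChild = (rawArguments: string) => Promise<string>;

// Pair a child result with the call_id of the tool call that requested it.
function toToolOutput(callId: string, output: string): ToolOutput {
  return { type: "function_call_output", call_id: callId, output };
}

async function runBatch(calls: FunctionCall[], runChild: RunChild): Promise<ToolOutput[]> {
  // Start every child before awaiting any of them, so they run concurrently.
  const pending = calls.map(async (call) =>
    toToolOutput(call.call_id, await runChild(call.arguments)),
  );
  // Barrier: resolves only after every child has finished.
  // Note: Promise.all rejects as soon as any child rejects, so runChild
  // should catch its own errors if one failure must not sink the batch.
  return Promise.all(pending);
}
```

Because each promise carries its own `call_id`, the results stay matched to their tool calls regardless of which child finishes first.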
Use the same X-On-Behalf-Of value on the parent and child requests.
If a child response may take longer, you can set background: true on the child request and poll with sdk.llm.responses.get until it completes.
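A polling loop for a background child could look like the sketch below. The status names are an assumption about what the API reports, and `get` is a stand-in for sdk.llm.responses.get, passed in so the sketch does not depend on the SDK's exact signature. The interval and poll limit are arbitrary.

```typescript
// Assumed status values; confirm against the API reference.
type ChildStatus = "queued" | "in_progress" | "completed" | "failed" | "cancelled" | "incomplete";
type ChildResponse = { id: string; status: ChildStatus };

// Stand-in for sdk.llm.responses.get.
type GetResponse = (id: string) => Promise<ChildResponse>;

// A child is terminal once it is no longer waiting or running.
function isTerminal(status: ChildStatus): boolean {
  return status !== "queued" && status !== "in_progress";
}

async function pollUntilDone(
  id: string,
  get: GetResponse,
  intervalMs = 2000, // arbitrary polling interval
  maxPolls = 150, // arbitrary upper bound on polls
): Promise<ChildResponse> {
  for (let i = 0; i < maxPolls; i++) {
    const child = await get(id);
    if (isTerminal(child.status)) return child;
    // Sleep before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`child ${id} still running after ${maxPolls} polls`);
}
```

A terminal state is not necessarily a success, so the caller still has to check the final status before building the tool result.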
When you fan out to multiple child responses, wait until every child result is available before you resume the parent response.
Raw Responses request for the resume step
The critical handoff is the follow-up request. You pass every tool result back in input and point to the earlier parent turn with previous_response_id.
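As an illustration, the body of that follow-up request could be assembled as in this sketch. It assumes each tool result is sent as a JSON string in the `output` field. The model name and IDs in the usage comment are placeholders, and options such as tools are omitted, so carry over whatever your parent request needs.

```typescript
type ToolOutput = { type: "function_call_output"; call_id: string; output: string };
type ResumeRequest = { model: string; previous_response_id: string; input: ToolOutput[] };
type ChildResult = { callId: string; result: unknown };

function buildResumeRequest(
  model: string,
  parentResponseId: string,
  results: ChildResult[],
): ResumeRequest {
  return {
    model,
    // Points at the parent turn that emitted the tool calls.
    previous_response_id: parentResponseId,
    // One function_call_output per tool call, matched by call_id.
    input: results.map(({ callId, result }) => ({
      type: "function_call_output",
      call_id: callId,
      output: JSON.stringify(result),
    })),
  };
}

// Usage with placeholder values:
// buildResumeRequest("your-model", "resp_parent", [
//   { callId: "call_1", result: { summary: "..." } },
// ]);
```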
X-On-Behalf-Of.
Basic error handling and recovery
In real systems, this loop will fail sometimes. Common failures include malformed tool arguments, child request timeouts, upstream 5xx errors, rate limits, and child outputs that are too weak to be useful. The safest pattern is:
- Parse tool arguments defensively.
- Retry transient child request failures a small number of times with backoff.
- Return a structured failure payload to the parent instead of crashing the whole batch when one child fails.
- Keep child results small and explicit so the parent can decide whether to continue, retry, or answer with partial results.
For long-running child responses, use background: true plus polling.
The recovery logic stays the same: wait for the child to reach a terminal state, then return either a success payload or a structured failure payload to the parent.
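The recovery pattern in this section might be sketched as follows. Everything here is illustrative: the `task` argument, the `ChildResult` payload shape, and the retry settings are assumptions, and `runChild` is a hypothetical function that performs one child request. For brevity the retry helper retries every error; a real app would retry only transient ones (timeouts, 5xx, rate limits).

```typescript
type SpawnArgs = { task: string };
// Structured payload returned to the parent in place of a thrown error.
type ChildResult =
  | { ok: true; summary: string }
  | { ok: false; error: string; retryable: boolean };

// Parse defensively: malformed JSON or a missing field yields null.
function parseSpawnArgs(raw: string): SpawnArgs | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value === "object" && value !== null) {
      const task = (value as { task?: unknown }).task;
      if (typeof task === "string") return { task };
    }
  } catch {
    // Malformed JSON falls through to the null return.
  }
  return null;
}

// Retry with exponential backoff: 500 ms, then 1000 ms, and so on.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 500): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Never throws: one failed child becomes a failure payload, not a failed batch.
async function runChildSafely(
  rawArgs: string,
  runChild: (args: SpawnArgs) => Promise<string>,
): Promise<ChildResult> {
  const args = parseSpawnArgs(rawArgs);
  if (args === null) return { ok: false, error: "malformed tool arguments", retryable: false };
  try {
    return { ok: true, summary: await withRetry(() => runChild(args)) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message, retryable: true };
  }
}
```

Serializing the `ChildResult` into the tool output gives the parent enough information to continue, retry, or answer with partial results.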
Practical limits
- Keep parallel_tool_calls set to true when the parent should be able to delegate several child responses in one turn.
- Set parallel_tool_calls to false only when child responses share state or must run in order.
- Set max_tool_calls so a parent response cannot loop forever.
- Keep child outputs compact so the parent can incorporate the result without consuming too much context.
- Keep store: true while you build the workflow so you can inspect parent and child responses later.
- Use the Responses input items page in the API Reference when you need to debug the exact items sent back to the model.
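Taken together, these limits could be expressed as a small options object plus a helper that caps child output size. This is a sketch only: the numeric values are arbitrary examples rather than recommendations from this guide, and `compactOutput` is a hypothetical helper.

```typescript
// Options for the parent request; the numeric value is an arbitrary example.
const parentLimits = {
  parallel_tool_calls: true, // allow several delegations in one turn
  max_tool_calls: 8, // hard stop so the parent cannot loop forever
  store: true, // keep responses inspectable while building the workflow
};

// Hypothetical helper: cap a child's output before returning it to the parent.
function compactOutput(text: string, maxChars = 2000): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars) + " [truncated]";
}
```

Truncation is the bluntest way to keep outputs compact; asking the child to return a short summary in the first place usually preserves more useful information.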