The history tool gives models long-term memory that persists across sessions.
When enabled, every request-response pair is automatically stored and indexed.
The model can then semantically search past interactions to recall information from earlier conversations.
How it works
- Add `{ type: "history" }` to the `tools` array in your request
- The model receives a `history` function it can call with a search query
- Past conversations are searched using vector embeddings for semantic similarity
- After each response completes, the user message and assistant reply are stored automatically in the background
Each `X-On-Behalf-Of` user ID gets an isolated history store. Different end users cannot see each other's history.
Enable the history tool
Include the history tool in the `tools` array and set `store: true` so the conversation is persisted and available for future recall.
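A minimal sketch of what such a request might contain, written in Python. Only the `tools` entry, the `store` flag, and the `X-On-Behalf-Of` header come from this page; the `model` and `input` field names, the model name, and the helper `build_history_request` are assumptions made for illustration, so check them against the request reference before use.

```python
import json
from typing import Dict, Tuple


def build_history_request(
    user_message: str,
    end_user_id: str,
    model: str = "example-model",  # hypothetical model name
) -> Tuple[Dict[str, str], dict]:
    """Return (headers, body) for a request with the history tool enabled."""
    headers = {
        "Content-Type": "application/json",
        # Scopes stored history to this end user.
        "X-On-Behalf-Of": end_user_id,
    }
    body = {
        "model": model,          # assumed field name
        "input": user_message,   # assumed field name
        "tools": [{"type": "history"}],  # gives the model the history function
        "store": True,           # persist this exchange for future recall
    }
    return headers, body


headers, body = build_history_request("My project deadline is March 14.", "user-123")
print(json.dumps(body, indent=2))
```

The helper only builds the payload; sending it is left to whatever HTTP client and endpoint your integration already uses.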
Recall information from a previous session
In a later request, whether minutes, hours, or days later, the model can search its history to find relevant past interactions. The model decides when to call the history tool based on the user's question.
Full example: store and retrieve across sessions
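A rough sketch of the two-request flow in Python, under stated assumptions: the `model` and `input` fields, the model name, and the helpers `history_request` and `store_then_recall` are hypothetical, and the transport is passed in as a function because this page does not specify the endpoint. What matters is that both requests enable the history tool and carry the same `X-On-Behalf-Of` value.

```python
from typing import Callable, Dict, List, Tuple

# A transport takes (headers, body) and returns the parsed response.
Transport = Callable[[Dict[str, str], dict], dict]


def history_request(message: str, end_user_id: str) -> Tuple[Dict[str, str], dict]:
    """Build one request with the history tool enabled and storage on."""
    headers = {"X-On-Behalf-Of": end_user_id}
    body = {
        "model": "example-model",  # hypothetical model name
        "input": message,          # assumed field name
        "tools": [{"type": "history"}],
        "store": True,
    }
    return headers, body


def store_then_recall(send: Transport, end_user_id: str) -> List[dict]:
    """Send a fact in one request, then ask about it in a separate request."""
    messages = (
        "My project deadline is March 14.",  # session 1: information to store
        "When is my project deadline?",      # session 2: question that needs recall
    )
    responses = []
    for message in messages:
        headers, body = history_request(message, end_user_id)
        responses.append(send(headers, body))
    return responses


# Stand-in transport that records requests instead of calling a real API.
sent: List[Tuple[Dict[str, str], dict]] = []


def recording_transport(headers: Dict[str, str], body: dict) -> dict:
    sent.append((headers, body))
    return {}


store_then_recall(recording_transport, "user-123")
```

In a real integration, `recording_transport` would be replaced by a function that posts the body to the API and returns the parsed JSON.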
This example shows the complete flow: storing information in one request and retrieving it in a separate request.
Behavior details
| Aspect | Detail |
|---|---|
| Storage | Automatic — each request/response pair is indexed after the response completes |
| Search | Semantic — uses vector embeddings, not keyword matching |
| Scope | Per end-user — isolated by X-On-Behalf-Of header |
| Indexing | Background — does not add latency to the response |
| Results | Up to 10 most relevant past interactions returned per search |
| Entry size | Text truncated to 7,500 characters per entry for embedding |
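To make the behavior in the table concrete, here is a toy in-memory model of it in Python. It is not the service's implementation: the class `ToyHistoryStore` and its methods are hypothetical, and a bag-of-words vector stands in for a real embedding model, so it matches on shared words rather than on meaning. The per-user isolation, the 10-result cap, and the 7,500-character truncation mirror the values listed above.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

MAX_RESULTS = 10        # results returned per search
MAX_ENTRY_CHARS = 7500  # text is truncated to this length before embedding


def embed(text: str) -> Counter:
    """Toy embedding: word counts over the truncated, lowercased text."""
    return Counter(re.findall(r"[a-z0-9]+", text[:MAX_ENTRY_CHARS].lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyHistoryStore:
    """Keeps a separate list of entries for each end-user ID."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Tuple[Counter, str]]] = defaultdict(list)

    def add(self, user_id: str, user_message: str, assistant_reply: str) -> None:
        """Index one request/response pair under the given user."""
        text = f"User: {user_message}\nAssistant: {assistant_reply}"
        self._entries[user_id].append((embed(text), text))

    def search(self, user_id: str, query: str) -> List[str]:
        """Return up to MAX_RESULTS of this user's entries, most similar first."""
        query_vec = embed(query)
        scored = [
            (cosine(query_vec, vec), text)
            for vec, text in self._entries.get(user_id, [])
        ]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:MAX_RESULTS]]


store = ToyHistoryStore()
store.add("alice", "My project deadline is March 14.", "Noted.")
store.add("bob", "I prefer dark mode.", "Got it.")
alice_hits = store.search("alice", "deadline")
bob_hits = store.search("bob", "deadline")
```

Here `alice_hits` contains Alice's stored exchange, while `bob_hits` is empty because searches only ever read the caller's own entries.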
When to use the history tool
- Personalization: Remember user preferences, names, or context across sessions
- Project continuity: Recall decisions, deadlines, or requirements discussed earlier
- Support workflows: Maintain context about a user’s issue history
- Assistants: Build assistants that learn and adapt to individual users over time
Next steps
- Conversations — manage multi-turn exchanges within a single session
- Files and vector stores — store and search documents
- Generate a response — text requests and multi-turn exchanges