> ## Documentation Index
> Fetch the complete documentation index at: https://docs.mka1.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Long-term memory

> Use the history tool to give models persistent memory across sessions, enabling recall of past conversations per end-user.

The `history` tool gives models long-term memory that persists across sessions.
When enabled, every request-response pair is automatically stored and indexed.
The model can then semantically search past interactions to recall information from earlier conversations.

## How it works

1. Add `{ type: "history" }` to the `tools` array in your request
2. The model receives a `history` function it can call with a search query
3. Past conversations are searched using vector embeddings for semantic similarity
4. After each response completes, the user message and assistant reply are stored automatically in the background

Memory is **scoped per end-user** — each `X-On-Behalf-Of` user ID gets an isolated history store. Different end-users cannot see each other's history.

## Enable the history tool

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    --model auto \
    --body '{
      "input": "Remember this: my favorite color is blue.",
      "tools": [{ "type": "history" }],
      "store": true
    }' \
    -H 'X-On-Behalf-Of: <end-user-id>'
  ```

  ```ts MKA1 SDK theme={null}
  import { SDK } from '@meetkai/mka1';

  const mka1 = new SDK({
    bearerAuth: `Bearer ${YOUR_API_KEY}`,
  });

  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: 'Remember this: my favorite color is blue.',
    tools: [{ type: 'history' }],
    store: true,
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });
  ```

  ```ts OpenAI SDK theme={null}
  import OpenAI from 'openai';

  const openai = new OpenAI({
    apiKey: '<mka1-api-key>',
    baseURL: 'https://apigw.mka1.com/api/v1/llm/',
    defaultHeaders: { 'X-On-Behalf-Of': '<end-user-id>' },
  });

  const response = await openai.responses.create({
    model: 'auto',
    input: 'Remember this: my favorite color is blue.',
    tools: [{ type: 'history' }],
    store: true,
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: $"Bearer {YOUR_API_KEY}",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateStr(
          "Remember this: my favorite color is blue."),
      Tools = new List<ResponsesCreateRequestTool>
      {
          ResponsesCreateRequestTool.CreateHistoryToolDefinition(new HistoryToolDefinition()),
      },
      Store = true,
  });
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer YOUR_API_KEY")

  result = sdk.llm.responses.create(
      model="auto",
      input="Remember this: my favorite color is blue.",
      tools=[{"type": "history"}],
      store=True,
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "auto",
      "input": "Remember this: my favorite color is blue.",
      "tools": [{ "type": "history" }],
      "store": true
    }'
  ```
</CodeGroup>

Set `store: true` so the conversation is persisted and available for future recall.

## Recall information from a previous session

In a later request — even minutes, hours, or days later — the model can search its history to find relevant past interactions. The model decides when to call the history tool based on the user's question.

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    --model auto \
    --body '{
      "input": "What is my favorite color?",
      "tools": [{ "type": "history" }],
      "store": true
    }'
  ```

  ```ts MKA1 SDK theme={null}
  // In a new session, the model can recall past conversations
  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: 'What is my favorite color?',
    tools: [{ type: 'history' }],
    store: true,
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });

  // The model calls the history tool, finds the earlier conversation,
  // and responds: "Your favorite color is blue."
  ```

  ```ts OpenAI SDK theme={null}
  const response = await openai.responses.create({
    model: 'auto',
    input: 'What is my favorite color?',
    tools: [{ type: 'history' }],
    store: true,
    stream: false,
  });

  // The model calls the history tool, finds the earlier conversation,
  // and responds: "Your favorite color is blue."
  ```

  ```csharp C# SDK theme={null}
  // In a new session, the model can recall past conversations
  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateStr("What is my favorite color?"),
      Tools = new List<ResponsesCreateRequestTool>
      {
          ResponsesCreateRequestTool.CreateHistoryToolDefinition(new HistoryToolDefinition()),
      },
      Store = true,
  });

  // The model calls the history tool, finds the earlier conversation,
  // and responds: "Your favorite color is blue."
  ```

  ```python Python SDK theme={null}
  # In a new session, the model can recall past conversations
  result = sdk.llm.responses.create(
      model="auto",
      input="What is my favorite color?",
      tools=[{"type": "history"}],
      store=True,
  )

  # The model calls the history tool, finds the earlier conversation,
  # and responds: "Your favorite color is blue."
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "auto",
      "input": "What is my favorite color?",
      "tools": [{ "type": "history" }],
      "store": true
    }'
  ```
</CodeGroup>

## Full example: store and retrieve across sessions

This example shows the complete flow — storing information in one request and retrieving it in a separate request.

<CodeGroup>
  ```bash CLI theme={null}
  # Session 1: Tell the model something to remember
  mka1 llm responses create \
    --model auto \
    --body '{
      "input": "Remember this: the project deadline is March 15th and the budget is $50,000.",
      "tools": [{ "type": "history" }],
      "store": true
    }'

  # Session 2: Ask about it later
  mka1 llm responses create \
    --model auto \
    --body '{
      "input": "What is the project deadline and budget?",
      "tools": [{ "type": "history" }],
      "store": true
    }'
  ```

  ```ts MKA1 SDK theme={null}
  import { SDK } from '@meetkai/mka1';

  const mka1 = new SDK({
    bearerAuth: `Bearer ${YOUR_API_KEY}`,
  });

  const headers = { 'X-On-Behalf-Of': 'user-123' };

  // Session 1: Tell the model something to remember
  const first = await mka1.llm.responses.create({
    model: 'auto',
    input: 'Remember this: the project deadline is March 15th and the budget is $50,000.',
    tools: [{ type: 'history' }],
    store: true,
  }, { headers });

  console.log('Stored:', first.outputText);

  // Session 2: Ask about it later
  const second = await mka1.llm.responses.create({
    model: 'auto',
    input: 'What is the project deadline and budget?',
    tools: [{ type: 'history' }],
    store: true,
  }, { headers });

  console.log('Recalled:', second.outputText);
  // → "The project deadline is March 15th and the budget is $50,000."
  ```

  ```ts OpenAI SDK theme={null}
  import OpenAI from 'openai';

  const openai = new OpenAI({
    apiKey: '<mka1-api-key>',
    baseURL: 'https://apigw.mka1.com/api/v1/llm/',
    defaultHeaders: { 'X-On-Behalf-Of': 'user-123' },
  });

  // Session 1: Tell the model something to remember
  const first = await openai.responses.create({
    model: 'auto',
    input: 'Remember this: the project deadline is March 15th and the budget is $50,000.',
    tools: [{ type: 'history' }],
    store: true,
    stream: false,
  });

  console.log('Stored:', first.output_text);

  // Session 2: Ask about it later
  const second = await openai.responses.create({
    model: 'auto',
    input: 'What is the project deadline and budget?',
    tools: [{ type: 'history' }],
    store: true,
    stream: false,
  });

  console.log('Recalled:', second.output_text);
  // → "The project deadline is March 15th and the budget is $50,000."
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: $"Bearer {YOUR_API_KEY}",
      serverUrl: "https://apigw.mka1.com"
  );

  var historyTools = new List<ResponsesCreateRequestTool>
  {
      ResponsesCreateRequestTool.CreateHistoryToolDefinition(new HistoryToolDefinition()),
  };

  // Session 1: Tell the model something to remember
  var first = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateStr(
          "Remember this: the project deadline is March 15th and the budget is $50,000."),
      Tools = historyTools,
      Store = true,
  });

  Console.WriteLine($"Stored: {first.OutputText}");

  // Session 2: Ask about it later
  var second = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateStr(
          "What is the project deadline and budget?"),
      Tools = historyTools,
      Store = true,
  });

  Console.WriteLine($"Recalled: {second.OutputText}");
  // -> "The project deadline is March 15th and the budget is $50,000."
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer YOUR_API_KEY")

  # Session 1: Tell the model something to remember
  first = sdk.llm.responses.create(
      model="auto",
      input="Remember this: the project deadline is March 15th and the budget is $50,000.",
      tools=[{"type": "history"}],
      store=True,
  )

  print("Stored:", first.output_text)

  # Session 2: Ask about it later
  second = sdk.llm.responses.create(
      model="auto",
      input="What is the project deadline and budget?",
      tools=[{"type": "history"}],
      store=True,
  )

  print("Recalled:", second.output_text)
  # -> "The project deadline is March 15th and the budget is $50,000."
  ```

  ```bash bash theme={null}
  # Session 1: Store information
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: user-123' \
    --data '{
      "model": "auto",
      "input": "Remember this: the project deadline is March 15th and the budget is $50,000.",
      "tools": [{ "type": "history" }],
      "store": true
    }'

  # Session 2: Retrieve it later
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: user-123' \
    --data '{
      "model": "auto",
      "input": "What is the project deadline and budget?",
      "tools": [{ "type": "history" }],
      "store": true
    }'
  ```
</CodeGroup>

## Behavior details

| Aspect     | Detail                                                                         |
| ---------- | ------------------------------------------------------------------------------ |
| Storage    | Automatic — each request/response pair is indexed after the response completes |
| Search     | Semantic — uses vector embeddings, not keyword matching                        |
| Scope      | Per end-user — isolated by `X-On-Behalf-Of` header                             |
| Indexing   | Background — does not add latency to the response                              |
| Results    | Up to 10 most relevant past interactions returned per search                   |
| Entry size | Text truncated to 7,500 characters per entry for embedding                     |

## When to use the history tool

* **Personalization**: Remember user preferences, names, or context across sessions
* **Project continuity**: Recall decisions, deadlines, or requirements discussed earlier
* **Support workflows**: Maintain context about a user's issue history
* **Assistants**: Build assistants that learn and adapt to individual users over time

## Next steps

* [Conversations](/docs/conversations) — manage multi-turn exchanges within a single session
* [Files and vector stores](/docs/files-and-vector-stores) — store and search documents
* [Generate a response](/docs/generate-a-response) — text requests and multi-turn exchanges