# Multimodal input

> Send images, audio, documents, and mixed content to the MKA1 API for vision, transcription, OCR, and multimodal reasoning.

The Responses API accepts text, images, audio, and files in a single request.
Use structured `input` with content arrays to combine modalities.

## Supported input types

| Type     | Content type  | Formats                              | Delivery                           |
| -------- | ------------- | ------------------------------------ | ---------------------------------- |
| Text     | `input_text`  | Plain text                           | Inline                             |
| Image    | `input_image` | JPEG, PNG, WebP, GIF, TIFF           | URL, base64 data URI, or `file_id` |
| Audio    | `input_audio` | WAV, MP3                             | Base64                             |
| Document | `input_file`  | PDF, DOCX, XLSX, PPTX, RTF, TXT, CSV | URL, base64 data URI, or `file_id` |
| Video    | `input_file`  | MP4                                  | Base64 data URI or `file_id`       |

## Image input

Send an image for the model to describe, analyze, or answer questions about.
Provide the image as a URL, a base64 data URI, or a previously uploaded `file_id`.

### Image via URL

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    -H 'X-On-Behalf-Of: <end-user-id>' \
    --body '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Describe what you see in this image." },
            {
              "type": "input_image",
              "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
            }
          ]
        }
      ]
    }'
  ```

  ```ts MKA1 SDK theme={null}
  import { SDK } from '@meetkai/mka1';

  const mka1 = new SDK({
    bearerAuth: `Bearer ${YOUR_API_KEY}`,
  });

  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Describe what you see in this image.' },
          {
            type: 'input_image',
            image_url: 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg',
          },
        ],
      },
    ],
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });
  ```

  ```ts OpenAI SDK theme={null}
  import OpenAI from 'openai';

  const openai = new OpenAI({
    apiKey: '<mka1-api-key>',
    baseURL: 'https://apigw.mka1.com/api/v1/llm/',
    defaultHeaders: { 'X-On-Behalf-Of': '<end-user-id>' },
  });

  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Describe what you see in this image.' },
          {
            type: 'input_image',
            image_url: 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg',
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: $"Bearer {YOUR_API_KEY}",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "Describe what you see in this image.",
                      }),
                      InputMessageContent.CreateInputImage(new InputImage()
                      {
                          ImageUrl = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg",
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer YOUR_API_KEY")

  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "Describe what you see in this image."},
              {
                  "type": "input_image",
                  "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg",
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Describe what you see in this image." },
            {
              "type": "input_image",
              "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
            }
          ]
        }
      ]
    }'
  ```
</CodeGroup>

### Image via base64

Encode the image as a data URI with the appropriate MIME type.

<CodeGroup>
  ```bash CLI theme={null}
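  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  IMAGE_B64=$(base64 < photo.jpg | tr -d '\n')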

  mka1 llm responses create \
    --body "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"What is in this photo?\" },
            {
              \"type\": \"input_image\",
              \"image_url\": \"data:image/jpeg;base64,${IMAGE_B64}\"
            }
          ]
        }
      ]
    }"
  ```

  ```ts MKA1 SDK theme={null}
  import { readFileSync } from 'fs';

  const imageBase64 = readFileSync('photo.jpg').toString('base64');

  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'What is in this photo?' },
          {
            type: 'input_image',
            image_url: `data:image/jpeg;base64,${imageBase64}`,
          },
        ],
      },
    ],
  });
  ```

  ```ts OpenAI SDK theme={null}
  import { readFileSync } from 'fs';

  const imageBase64 = readFileSync('photo.jpg').toString('base64');

  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'What is in this photo?' },
          {
            type: 'input_image',
            image_url: `data:image/jpeg;base64,${imageBase64}`,
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var imageBytes = System.IO.File.ReadAllBytes("photo.jpg");
  var imageBase64 = Convert.ToBase64String(imageBytes);

  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "What is in this photo?",
                      }),
                      InputMessageContent.CreateInputImage(new InputImage()
                      {
                          ImageUrl = $"data:image/jpeg;base64,{imageBase64}",
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  import base64

  with open("photo.jpg", "rb") as f:
      image_base64 = base64.b64encode(f.read()).decode()

  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "What is in this photo?"},
              {
                  "type": "input_image",
                  "image_url": f"data:image/jpeg;base64,{image_base64}",
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
  # Encode a local image and send it inline
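  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  IMAGE_B64=$(base64 < photo.jpg | tr -d '\n')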

  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"What is in this photo?\" },
            {
              \"type\": \"input_image\",
              \"image_url\": \"data:image/jpeg;base64,${IMAGE_B64}\"
            }
          ]
        }
      ]
    }"
  ```
</CodeGroup>
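
The examples above hardcode `image/jpeg`. As a minimal sketch, the MIME type can instead be inferred from the file extension with Python's standard `mimetypes` module. The helper name `image_to_data_uri` is hypothetical and not part of any MKA1 SDK; it only builds the string you would pass as `image_url`.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Return a base64 data URI for a local image file (hypothetical helper)."""
    # Infer the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type for {path!r}")

    # b64encode never inserts line breaks, so the result is safe inside JSON.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the `image_url` value of an `input_image` content part. Extension-based detection does not inspect the file contents, so a mislabeled file still produces a mislabeled data URI.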

### Image via file\_id

Upload an image with the Files API first, then reference it by ID.

<CodeGroup>
  ```bash CLI theme={null}
  # Upload the image
  FILE_ID=$(mka1 llm files upload \
    --file @photo.jpg \
    --purpose assistants | jq -r '.id')

  # Use the file_id
  mka1 llm responses create \
    --body "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"Describe this image.\" },
            { \"type\": \"input_image\", \"file_id\": \"${FILE_ID}\" }
          ]
        }
      ]
    }"
  ```

  ```ts MKA1 SDK theme={null}
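  import { readFileSync } from 'fs';

  // Read the image from disk
  const imageBuffer = readFileSync('photo.jpg');

  // Upload the image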
  const uploadResult = await mka1.llm.files.create({
    file: { fileName: 'photo.jpg', content: imageBuffer },
    purpose: 'assistants',
  });

  // Use the file_id in a response
  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Describe this image.' },
          { type: 'input_image', file_id: uploadResult.id },
        ],
      },
    ],
  });
  ```

  ```ts OpenAI SDK theme={null}
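  import { readFileSync } from 'fs';

  // Read the image from disk
  const imageBuffer = readFileSync('photo.jpg');

  // Upload the image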
  const file = await openai.files.create({
    file: new File([imageBuffer], 'photo.jpg', { type: 'image/jpeg' }),
    purpose: 'assistants',
  });

  // Use the file_id in a response
  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Describe this image.' },
          { type: 'input_image', file_id: file.id },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1.Types.Requests;

  // Upload the image
  var uploadResult = await sdk.Llm.Files.UploadAsync(new UploadFileRequestBody()
  {
      File = new UploadFileFile()
      {
          FileName = "photo.jpg",
          Content = System.IO.File.ReadAllBytes("photo.jpg"),
      },
      Purpose = UploadFilePurpose.Assistants,
  });

  // Use the file_id in a response
  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "Describe this image.",
                      }),
                      InputMessageContent.CreateInputImage(new InputImage()
                      {
                          FileId = uploadResult.File!.Id,
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  # Upload the image
  upload_result = sdk.llm.files.upload(
      file={"file_name": "photo.jpg", "content": open("photo.jpg", "rb")},
      purpose="assistants",
  )

  # Use the file_id in a response
  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "Describe this image."},
              {"type": "input_image", "file_id": upload_result.id},
          ],
      }],
  )
  ```

  ```bash bash theme={null}
  # Upload the image
  FILE_ID=$(curl -s https://apigw.mka1.com/api/v1/llm/files \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --form file=@photo.jpg \
    --form purpose=assistants | jq -r '.id')

  # Use the file_id
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"Describe this image.\" },
            { \"type\": \"input_image\", \"file_id\": \"${FILE_ID}\" }
          ]
        }
      ]
    }"
  ```
</CodeGroup>

## Audio input

Send audio for the model to process. The audio is automatically transcribed and the model responds to the spoken content.

Supported formats: **WAV** and **MP3** (max 25 MB).

<CodeGroup>
  ```bash CLI theme={null}
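  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  AUDIO_B64=$(base64 < recording.wav | tr -d '\n')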

  mka1 llm responses create \
    --body "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            {
              \"type\": \"input_audio\",
              \"input_audio\": {
                \"data\": \"${AUDIO_B64}\",
                \"format\": \"wav\"
              }
            }
          ]
        }
      ]
    }"
  ```

  ```ts MKA1 SDK theme={null}
  import { readFileSync } from 'fs';

  const audioBase64 = readFileSync('recording.wav').toString('base64');

  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_audio',
            input_audio: {
              data: audioBase64,
              format: 'wav',
            },
          },
        ],
      },
    ],
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });
  ```

  ```ts OpenAI SDK theme={null}
  import { readFileSync } from 'fs';

  const audioBase64 = readFileSync('recording.wav').toString('base64');

  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_audio',
            input_audio: {
              data: audioBase64,
              format: 'wav',
            },
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var audioBytes = System.IO.File.ReadAllBytes("recording.wav");
  var audioBase64 = Convert.ToBase64String(audioBytes);

  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputAudio(new InputAudio()
                      {
                          InputAudioValue = new InputAudioInputAudio()
                          {
                              Data = audioBase64,
                              Format = InputAudioFormat.Wav,
                          },
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  import base64

  with open("recording.wav", "rb") as f:
      audio_base64 = base64.b64encode(f.read()).decode()

  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {
                  "type": "input_audio",
                  "input_audio": {
                      "data": audio_base64,
                      "format": "wav",
                  },
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
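  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  AUDIO_B64=$(base64 < recording.wav | tr -d '\n')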

  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            {
              \"type\": \"input_audio\",
              \"input_audio\": {
                \"data\": \"${AUDIO_B64}\",
                \"format\": \"wav\"
              }
            }
          ]
        }
      ]
    }"
  ```
</CodeGroup>
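
Since only WAV and MP3 are accepted and the limit is 25 MB, it can help to validate a file locally before encoding it. The sketch below is one way to do that. The helper name `build_audio_part` is hypothetical, and two details are assumptions not stated on this page: that "25 MB" means 25 × 1024 × 1024 bytes, and that the limit applies to the raw file rather than the base64 payload.

```python
import base64
from pathlib import Path

# Assumption: the documented 25 MB limit is measured on the raw file, in MiB.
MAX_AUDIO_BYTES = 25 * 1024 * 1024
SUPPORTED_AUDIO_FORMATS = {"wav", "mp3"}


def build_audio_part(path: str) -> dict:
    """Validate a local audio file and return an `input_audio` content part."""
    file = Path(path)

    # Derive the `format` field from the extension: "recording.wav" -> "wav".
    audio_format = file.suffix.lower().lstrip(".")
    if audio_format not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"Unsupported audio format: {audio_format or 'none'}")

    raw = file.read_bytes()
    if len(raw) > MAX_AUDIO_BYTES:
        raise ValueError(f"{file.name} is {len(raw)} bytes; limit is {MAX_AUDIO_BYTES}")

    return {
        "type": "input_audio",
        "input_audio": {
            "data": base64.b64encode(raw).decode("ascii"),
            "format": audio_format,
        },
    }
```

The returned dictionary has the same shape as the `input_audio` entry in the examples above and can be placed directly in a message's `content` array.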

For example, sending a WAV file that contains the spoken phrase "Hello, how are you today?" returns a response like this:

```json theme={null}
{
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! I'm doing well, thank you for asking. I'm here and ready to help you with any questions or tasks you might have. How can I assist you today?"
        }
      ]
    }
  ]
}
```
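
To pull the assistant's reply out of a response shaped like the one above, you can walk `output` and collect every `output_text` part. This is a minimal sketch that assumes the response has already been parsed into a plain dictionary (for example, from the curl output); SDK clients may return typed objects instead. The function name `collect_output_text` is hypothetical.

```python
def collect_output_text(response: dict) -> str:
    """Concatenate the text of all `output_text` parts in a parsed response."""
    chunks = []
    for item in response.get("output", []):
        # Skip anything that is not an assistant message item.
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)
```

Applied to the sample response above, the function returns the single greeting string from the `text` field.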

## Document input

Send documents for the model to read and reason over.
PDFs and scanned documents are processed with OCR automatically; no extra configuration is needed.

### Document via URL

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    --body '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Summarize this document in three bullet points." },
            {
              "type": "input_file",
              "file_url": "https://example.com/report.pdf",
              "filename": "report.pdf"
            }
          ]
        }
      ]
    }'
  ```

  ```ts MKA1 SDK theme={null}
  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Summarize this document in three bullet points.' },
          {
            type: 'input_file',
            file_url: 'https://example.com/report.pdf',
            filename: 'report.pdf',
          },
        ],
      },
    ],
  });
  ```

  ```ts OpenAI SDK theme={null}
  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Summarize this document in three bullet points.' },
          {
            type: 'input_file',
            file_url: 'https://example.com/report.pdf',
            filename: 'report.pdf',
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "Summarize this document in three bullet points.",
                      }),
                      InputMessageContent.CreateInputFile(new InputFile()
                      {
                          FileUrl = "https://example.com/report.pdf",
                          Filename = "report.pdf",
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "Summarize this document in three bullet points."},
              {
                  "type": "input_file",
                  "file_url": "https://example.com/report.pdf",
                  "filename": "report.pdf",
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Summarize this document in three bullet points." },
            {
              "type": "input_file",
              "file_url": "https://example.com/report.pdf",
              "filename": "report.pdf"
            }
          ]
        }
      ]
    }'
  ```
</CodeGroup>

### Document via base64

Encode the file as a data URI. Include the MIME type so the API can route it to the correct processor.

<CodeGroup>
  ```bash CLI theme={null}
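  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  PDF_B64=$(base64 < contract.pdf | tr -d '\n')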

  mka1 llm responses create \
    --body "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"What are the key terms in this contract?\" },
            {
              \"type\": \"input_file\",
              \"file_data\": \"data:application/pdf;base64,${PDF_B64}\",
              \"filename\": \"contract.pdf\"
            }
          ]
        }
      ]
    }"
  ```

  ```ts MKA1 SDK theme={null}
  import { readFileSync } from 'fs';

  const pdfBase64 = readFileSync('contract.pdf').toString('base64');

  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'What are the key terms in this contract?' },
          {
            type: 'input_file',
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: 'contract.pdf',
          },
        ],
      },
    ],
  });
  ```

  ```ts OpenAI SDK theme={null}
  import { readFileSync } from 'fs';

  const pdfBase64 = readFileSync('contract.pdf').toString('base64');

  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'What are the key terms in this contract?' },
          {
            type: 'input_file',
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: 'contract.pdf',
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var pdfBytes = System.IO.File.ReadAllBytes("contract.pdf");
  var pdfBase64 = Convert.ToBase64String(pdfBytes);

  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "What are the key terms in this contract?",
                      }),
                      InputMessageContent.CreateInputFile(new InputFile()
                      {
                          FileData = $"data:application/pdf;base64,{pdfBase64}",
                          Filename = "contract.pdf",
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  import base64

  with open("contract.pdf", "rb") as f:
      pdf_base64 = base64.b64encode(f.read()).decode()

  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "What are the key terms in this contract?"},
              {
                  "type": "input_file",
                  "file_data": f"data:application/pdf;base64,{pdf_base64}",
                  "filename": "contract.pdf",
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
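  # Strip newlines so the encoded value is safe to embed in JSON (GNU base64 wraps its output)
  PDF_B64=$(base64 < contract.pdf | tr -d '\n')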

  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data "{
      \"model\": \"auto\",
      \"input\": [
        {
          \"type\": \"message\",
          \"role\": \"user\",
          \"content\": [
            { \"type\": \"input_text\", \"text\": \"What are the key terms in this contract?\" },
            {
              \"type\": \"input_file\",
              \"file_data\": \"data:application/pdf;base64,${PDF_B64}\",
              \"filename\": \"contract.pdf\"
            }
          ]
        }
      ]
    }"
  ```
</CodeGroup>

### Scanned documents and OCR

Scanned PDFs and images of documents are processed automatically. The API uses OCR to extract text from:

* Scanned PDF pages (converted to images at 150 DPI, then OCR'd)
* Photos of documents (JPEG, PNG, TIFF)
* Office files (DOCX, XLSX, PPTX — converted to PDF first, then OCR'd)

Multi-page documents are processed in parallel. The extracted text is returned as Markdown and passed to the model for reasoning.

No special parameters are needed — just send the file as `input_file` and the pipeline handles detection, conversion, and OCR.

### Supported document formats

| Format                         | MIME type                                                                                                    | Processing               |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------ | ------------------------ |
| PDF                            | `application/pdf`                                                                                            | OCR per page at 150 DPI  |
| JPEG / PNG / TIFF / WebP / GIF | `image/*`                                                                                                    | Direct OCR               |
| Word (.doc, .docx)             | `application/msword`, `application/vnd.openxmlformats-officedocument.wordprocessingml.document`              | Convert to PDF, then OCR |
| Excel (.xls, .xlsx)            | `application/vnd.ms-excel`, `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`              | Convert to PDF, then OCR |
| PowerPoint (.ppt, .pptx)       | `application/vnd.ms-powerpoint`, `application/vnd.openxmlformats-officedocument.presentationml.presentation` | Convert to PDF, then OCR |
| RTF                            | `application/rtf`                                                                                            | Convert to PDF, then OCR |
| Plain text / CSV               | `text/plain`, `text/csv`                                                                                     | Read directly            |

**Size limit:** 30 MB per file.
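
The table above can be turned into a small lookup so that each local file is sent with the matching MIME type. The following is a sketch under stated assumptions: `build_file_part` is a hypothetical helper, the extension-to-MIME mapping is copied from the table, and the 30 MB limit is interpreted as 30 × 1024 × 1024 bytes of raw file data, which this page does not specify.

```python
import base64
from pathlib import Path

# Assumption: the documented 30 MB limit is measured on the raw file, in MiB.
MAX_FILE_BYTES = 30 * 1024 * 1024

# Extension -> MIME type, taken from the supported document formats table.
DOCUMENT_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ppt": "application/vnd.ms-powerpoint",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".rtf": "application/rtf",
    ".txt": "text/plain",
    ".csv": "text/csv",
}


def build_file_part(path: str) -> dict:
    """Return an `input_file` content part carrying the file as a data URI."""
    file = Path(path)
    mime_type = DOCUMENT_MIME_TYPES.get(file.suffix.lower())
    if mime_type is None:
        raise ValueError(f"Unsupported document extension: {file.suffix or 'none'}")

    raw = file.read_bytes()
    if len(raw) > MAX_FILE_BYTES:
        raise ValueError(f"{file.name} is {len(raw)} bytes; limit is {MAX_FILE_BYTES}")

    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "input_file",
        "file_data": f"data:{mime_type};base64,{encoded}",
        "filename": file.name,
    }
```

For files close to the size limit, uploading once and referencing the `file_id` avoids re-sending the encoded payload with every request.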

## Mixed input

Combine multiple content types in a single message. The model sees all inputs together and can reason across them.

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    --body '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?" },
            {
              "type": "input_image",
              "image_url": "https://example.com/chart.png"
            },
            {
              "type": "input_file",
              "file_url": "https://example.com/data.xlsx",
              "filename": "data.xlsx"
            }
          ]
        }
      ]
    }'
  ```

  ```ts MKA1 SDK theme={null}
  const result = await mka1.llm.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?' },
          {
            type: 'input_image',
            image_url: 'https://example.com/chart.png',
          },
          {
            type: 'input_file',
            file_url: 'https://example.com/data.xlsx',
            filename: 'data.xlsx',
          },
        ],
      },
    ],
  });
  ```

  ```ts OpenAI SDK theme={null}
  const response = await openai.responses.create({
    model: 'auto',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          { type: 'input_text', text: 'Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?' },
          {
            type: 'input_image',
            image_url: 'https://example.com/chart.png',
          },
          {
            type: 'input_file',
            file_url: 'https://example.com/data.xlsx',
            filename: 'data.xlsx',
          },
        ],
      },
    ],
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var res = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "auto",
      Input = ResponsesCreateRequestInput.CreateArrayOfItem(new List<Item>
      {
          Item.CreateInputMessage(new InputMessage()
          {
              Role = InputMessageRole.User,
              Content = InputMessageContent1.CreateArrayOfInputMessageContent(
                  new List<InputMessageContent>
                  {
                      InputMessageContent.CreateInputText(new InputText()
                      {
                          Text = "Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?",
                      }),
                      InputMessageContent.CreateInputImage(new InputImage()
                      {
                          ImageUrl = "https://example.com/chart.png",
                      }),
                      InputMessageContent.CreateInputFile(new InputFile()
                      {
                          FileUrl = "https://example.com/data.xlsx",
                          Filename = "data.xlsx",
                      }),
                  }),
          }),
      }),
  });
  ```

  ```python Python SDK theme={null}
  result = sdk.llm.responses.create(
      model="auto",
      input=[{
          "type": "message",
          "role": "user",
          "content": [
              {"type": "input_text", "text": "Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?"},
              {
                  "type": "input_image",
                  "image_url": "https://example.com/chart.png",
              },
              {
                  "type": "input_file",
                  "file_url": "https://example.com/data.xlsx",
                  "filename": "data.xlsx",
              },
          ],
      }],
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "auto",
      "input": [
        {
          "type": "message",
          "role": "user",
          "content": [
            { "type": "input_text", "text": "Compare the chart in the image with the data in the spreadsheet. Are the numbers consistent?" },
            {
              "type": "input_image",
              "image_url": "https://example.com/chart.png"
            },
            {
              "type": "input_file",
              "file_url": "https://example.com/data.xlsx",
              "filename": "data.xlsx"
            }
          ]
        }
      ]
    }'
  ```
</CodeGroup>
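
When the set of attachments varies per request, the content array can be assembled programmatically. The sketch below builds the same message shape as the examples above from a prompt plus lists of image and document URLs. The helper name `build_user_message` is hypothetical, and deriving `filename` from the last URL path segment (falling back to `"file"`) is an assumption made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_user_message(text: str, image_urls=(), file_urls=()) -> dict:
    """Build one user message that combines text, images, and documents."""
    content = [{"type": "input_text", "text": text}]

    for url in image_urls:
        content.append({"type": "input_image", "image_url": url})

    for url in file_urls:
        # Use the last path segment as the filename, ignoring any query string.
        filename = PurePosixPath(urlparse(url).path).name or "file"
        content.append({"type": "input_file", "file_url": url, "filename": filename})

    return {"type": "message", "role": "user", "content": content}
```

The result is a single item for the `input` array, for example `input=[build_user_message("Are the numbers consistent?", image_urls=[...], file_urls=[...])]`.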

## Next steps

* [Multimodal output](/docs/multimodal-output) — generate audio and images in responses
* [Files and vector stores](/docs/files-and-vector-stores) — upload and manage files for reuse
* [Generate a response](/docs/generate-a-response) — text-only requests and multi-turn exchanges
* [Advanced voice mode](/docs/advanced-voice-mode) — real-time voice conversations with LiveKit