# Multimodal output

> Generate spoken audio and images with the MKA1 API through the Responses resource.

The MKA1 API can return text, audio, and images. Text is the default output modality.
Use `modalities` and `audio` to enable speech output, or add the `image_generation` tool to produce images.

## Supported output types

| Modality       | How to enable                       | Output format                     |
| -------------- | ----------------------------------- | --------------------------------- |
| Text           | Default; no extra configuration     | `output_text` in the response     |
| Audio (speech) | Set `modalities: ["text", "audio"]` | Base64-encoded audio + transcript |
| Image          | Add the `image_generation` tool     | Image URL or base64               |
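As a rough sketch of how the table maps onto a request body, the helper below assembles the JSON payload from the output types you want. `build_request` is a hypothetical name used for illustration and is not part of any MKA1 SDK; it only sets the fields shown on this page (`modalities`, `audio`, and the `image_generation` tool).

```python
import json


def build_request(model, prompt, want_audio=False, want_image=False,
                  voice="alloy", audio_format="wav"):
    """Assemble a Responses request body for the requested output types.

    Hypothetical helper for illustration; it is not part of an MKA1 SDK.
    """
    # Text output is the default and needs no extra fields.
    body = {"model": model, "input": prompt}
    if want_audio:
        # Speech output: enable the audio modality and pick a voice and format.
        body["modalities"] = ["text", "audio"]
        body["audio"] = {"voice": voice, "format": audio_format}
    if want_image:
        # Image output: add the image_generation tool.
        body["tools"] = [{"type": "image_generation"}]
    return body


if __name__ == "__main__":
    request = build_request("meetkai:functionary-pt", "Say hello.", want_audio=True)
    print(json.dumps(request, indent=2))
```

The resulting dictionary can be serialized and sent as the request body, or unpacked into an SDK call.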

## Generate audio (text to speech)

Request audio output by setting `modalities` to `["text", "audio"]` and specifying a voice and format in the `audio` parameter. The response includes both the text transcript and the base64-encoded audio data.

### Audio configuration

| Parameter | Options                               | Default |
| --------- | ------------------------------------- | ------- |
| `voice`   | `alloy` and other voice profiles      | `alloy` |
| `format`  | `wav`, `mp3`, `flac`, `opus`, `pcm16` | `wav`   |

Audio is synthesized at 24 kHz, 16-bit mono.

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create \
    -H 'X-On-Behalf-Of: <end-user-id>' \
    --body '{
      "model": "meetkai:functionary-pt",
      "input": "Say hello in a friendly way. Keep it very short.",
      "modalities": ["text", "audio"],
      "audio": { "voice": "alloy", "format": "wav" }
    }'
  ```

  ```ts MKA1 SDK theme={null}
  import { SDK } from '@meetkai/mka1';

  const mka1 = new SDK({
    bearerAuth: `Bearer ${YOUR_API_KEY}`,
  });

  const result = await mka1.llm.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Say hello in a friendly way. Keep it very short.',
    modalities: ['text', 'audio'],
    audio: { voice: 'alloy', format: 'wav' },
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });

  // The output includes an output_audio item with base64 data and a transcript
  ```

  ```ts OpenAI SDK theme={null}
  import OpenAI from 'openai';

  const openai = new OpenAI({
    apiKey: '<mka1-api-key>',
    baseURL: 'https://apigw.mka1.com/api/v1/llm/',
    defaultHeaders: { 'X-On-Behalf-Of': '<end-user-id>' },
  });

  const response = await openai.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Say hello in a friendly way. Keep it very short.',
    modalities: ['text', 'audio'],
    audio: { voice: 'alloy', format: 'wav' },
    stream: false,
  });

  // Find the audio output
  const audioItem = response.output.find((item) => item.type === 'output_audio');
  // audioItem.data contains base64-encoded WAV
  // audioItem.transcript contains the spoken text
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: $"Bearer {YOUR_API_KEY}",
      serverUrl: "https://apigw.mka1.com"
  );

  var result = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "meetkai:functionary-pt",
      Input = ResponsesCreateRequestInput.CreateStr("Say hello in a friendly way. Keep it very short."),
      Modalities = new List<ResponsesCreateRequestModality>
      {
          ResponsesCreateRequestModality.Text,
          ResponsesCreateRequestModality.Audio,
      },
      Audio = new Audio()
      {
          Voice = "alloy",
          Format = ResponsesCreateRequestFormat.Wav,
      },
  });

  // The output includes an output_audio item with base64 data and a transcript
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer YOUR_API_KEY")

  result = sdk.llm.responses.create(
      model="meetkai:functionary-pt",
      input="Say hello in a friendly way. Keep it very short.",
      modalities=["text", "audio"],
      audio={"voice": "alloy", "format": "wav"},
  )

  # The output includes an output_audio item with base64 data and a transcript
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "meetkai:functionary-pt",
      "input": "Say hello in a friendly way. Keep it very short.",
      "modalities": ["text", "audio"],
      "audio": { "voice": "alloy", "format": "wav" }
    }'
  ```
</CodeGroup>

The response contains an `output_audio` item with the base64-encoded audio and a transcript of what was spoken:

```json theme={null}
{
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        { "type": "output_text", "text": "Hello!" }
      ]
    },
    {
      "type": "output_audio",
      "id": "audio_460caf1079b34fa0b4aa74448dff4ea7",
      "data": "<Base64-encoded WAV audio data>",
      "transcript": "Hi there!",
      "status": "completed"
    }
  ]
}
```

The `data` field contains the complete audio file (268 KB in this example). The `transcript` field contains the text the model chose to speak, which may differ slightly from the output text.
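To sanity-check what came back, you can decode `data` and read the WAV header. The sketch below assumes the response has been parsed into a plain dictionary shaped like the JSON above and that `wav` was requested; `describe_wav_audio` is a hypothetical helper name, and SDK response objects may need attribute access instead of dictionary lookups.

```python
import base64
import io
import wave


def describe_wav_audio(response):
    """Return (sample_rate, channels, sample_width_bytes, duration_s) for the
    first output_audio item, or None if the response has no audio item.

    Assumes `response` is a dict and the audio item holds a complete WAV file.
    """
    for item in response.get("output", []):
        if item.get("type") == "output_audio":
            wav_bytes = base64.b64decode(item["data"])
            # Parse the WAV header from memory instead of writing a file.
            with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
                rate = wav.getframerate()
                duration = wav.getnframes() / rate
                return rate, wav.getnchannels(), wav.getsampwidth(), duration
    return None
```

At 24 kHz, 16-bit mono, one second of audio takes 24,000 × 2 = 48,000 bytes, so a decoded WAV of about 268 KB would hold a little over five and a half seconds of speech.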

### Save audio to a file

<CodeGroup>
  ```bash CLI theme={null}
  # Generate audio, extract the base64 data, then decode it to a file
  mka1 llm responses create \
    --body '{
      "model": "meetkai:functionary-pt",
      "input": "Read this sentence aloud: The quick brown fox jumps over the lazy dog.",
      "modalities": ["text", "audio"],
      "audio": { "voice": "alloy", "format": "mp3" }
    }' \
    --output-format json \
    --jq '.output[] | select(.type == "output_audio") | .data' | base64 -d > output.mp3
  ```

  ```ts MKA1 SDK theme={null}
  import { writeFileSync } from 'fs';

  const result = await mka1.llm.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Read this sentence aloud: The quick brown fox jumps over the lazy dog.',
    modalities: ['text', 'audio'],
    audio: { voice: 'alloy', format: 'mp3' },
  });

  // Find the audio output in the response
  const audioItem = result.output.find((item) => item.type === 'output_audio');
  if (audioItem) {
    const audioBuffer = Buffer.from(audioItem.data, 'base64');
    writeFileSync('output.mp3', audioBuffer);
  }
  ```

  ```ts OpenAI SDK theme={null}
  import { writeFileSync } from 'fs';

  const response = await openai.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Read this sentence aloud: The quick brown fox jumps over the lazy dog.',
    modalities: ['text', 'audio'],
    audio: { voice: 'alloy', format: 'mp3' },
    stream: false,
  });

  const audioItem = response.output.find((item) => item.type === 'output_audio');
  if (audioItem) {
    const audioBuffer = Buffer.from(audioItem.data, 'base64');
    writeFileSync('output.mp3', audioBuffer);
  }
  ```

  ```csharp C# SDK theme={null}
  var result = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "meetkai:functionary-pt",
      Input = ResponsesCreateRequestInput.CreateStr(
          "Read this sentence aloud: The quick brown fox jumps over the lazy dog."),
      Modalities = new List<ResponsesCreateRequestModality>
      {
          ResponsesCreateRequestModality.Text,
          ResponsesCreateRequestModality.Audio,
      },
      Audio = new Audio()
      {
          Voice = "alloy",
          Format = ResponsesCreateRequestFormat.Mp3,
      },
  });

  // Save the audio output to a file
  // (iterate result.Output to find the output_audio item and decode its base64 data)
  ```

  ```python Python SDK theme={null}
  import base64

  result = sdk.llm.responses.create(
      model="meetkai:functionary-pt",
      input="Read this sentence aloud: The quick brown fox jumps over the lazy dog.",
      modalities=["text", "audio"],
      audio={"voice": "alloy", "format": "mp3"},
  )

  # Find the audio output in the response
  for item in result.output:
      if item.type == "output_audio":
          audio_bytes = base64.b64decode(item.data)
          with open("output.mp3", "wb") as f:
              f.write(audio_bytes)
  ```

  ```bash bash theme={null}
  # Generate audio, extract the base64 data, then decode it to a file
  curl -s https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data '{
      "model": "meetkai:functionary-pt",
      "input": "Read this sentence aloud: The quick brown fox jumps over the lazy dog.",
      "modalities": ["text", "audio"],
      "audio": { "voice": "alloy", "format": "mp3" }
    }' | jq -r '.output[] | select(.type == "output_audio") | .data' | base64 -d > output.mp3
  ```
</CodeGroup>
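The examples above write `mp3` data straight to disk. If you request `pcm16` instead, the bytes may need a container before most players will open them. The sketch below assumes that `pcm16` returns headerless 16-bit little-endian samples (this page does not say so explicitly) and wraps them in a WAV file using the 24 kHz mono parameters stated earlier. `save_pcm16_as_wav` is a hypothetical helper name.

```python
import base64
import wave


def save_pcm16_as_wav(b64_pcm, path, sample_rate=24000, channels=1):
    """Decode base64 PCM16 samples and write them into a WAV container.

    Assumes headerless 16-bit little-endian samples (an assumption, not
    confirmed by the API reference). Returns the duration in seconds.
    """
    pcm = base64.b64decode(b64_pcm)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 16-bit samples are 2 bytes each
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(pcm) / (2 * channels * sample_rate)
```

If the API already returns a complete file for `pcm16`, skip this step and write the decoded bytes directly.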

### Supported languages

Audio output supports automatic language detection and more than 20 languages, including English, Chinese, Hindi, Spanish, Arabic, Bengali, Portuguese, Russian, Japanese, Punjabi, German, Korean, French, Turkish, Italian, Thai, Polish, Dutch, Indonesian, Vietnamese, and Urdu.

## Generate images

Use the `image_generation` tool to create images from text prompts. The model interprets your message, writes a prompt for the image model, and returns the result.

### Image generation models

| Model                   | Best for                                 |
| ----------------------- | ---------------------------------------- |
| `meetkai:flux-2-klein`  | Fast, general-purpose generation (default) |
| `meetkai:z-image-turbo` | High-quality, detailed images            |

### Image generation options

| Parameter       | Options                                       | Default |
| --------------- | --------------------------------------------- | ------- |
| `size`          | `1024x1024`, `1024x1536`, `1536x1024`, `auto` | `auto`  |
| `quality`       | `low`, `medium`, `high`, `auto`               | `auto`  |
| `output_format` | `png`, `webp`, `jpeg`                         | `png`   |
| `background`    | `transparent`, `opaque`, `auto`               | `auto`  |

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create --body '{
    "model": "meetkai:functionary-pt",
    "input": "Generate an image of a sunset over a mountain lake.",
    "tools": [
      {
        "type": "image_generation",
        "model": "meetkai:flux-2-klein",
        "quality": "high",
        "size": "1024x1024",
        "output_format": "png"
      }
    ]
  }'
  ```

  ```ts MKA1 SDK theme={null}
  import { SDK } from '@meetkai/mka1';

  const mka1 = new SDK({
    bearerAuth: `Bearer ${YOUR_API_KEY}`,
  });

  const result = await mka1.llm.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Generate an image of a sunset over a mountain lake.',
    tools: [
      {
        type: 'image_generation',
        model: 'meetkai:flux-2-klein',
        quality: 'high',
        size: '1024x1024',
        output_format: 'png',
      },
    ],
  }, { headers: { 'X-On-Behalf-Of': '<end-user-id>' } });

  // The output includes an image_generation_call item with a result URL
  const imageCall = result.output.find((item) => item.type === 'image_generation_call');
  console.log('Image URL:', imageCall?.result);
  ```

  ```ts OpenAI SDK theme={null}
  import OpenAI from 'openai';

  const openai = new OpenAI({
    apiKey: '<mka1-api-key>',
    baseURL: 'https://apigw.mka1.com/api/v1/llm/',
    defaultHeaders: { 'X-On-Behalf-Of': '<end-user-id>' },
  });

  const response = await openai.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'Generate an image of a sunset over a mountain lake.',
    tools: [{ type: 'image_generation' }],
    stream: false,
  });

  const imageCall = response.output.find((item) => item.type === 'image_generation_call');
  console.log('Image URL:', imageCall?.result);
  ```

  ```csharp C# SDK theme={null}
  var result = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "meetkai:functionary-pt",
      Input = ResponsesCreateRequestInput.CreateStr(
          "Generate an image of a sunset over a mountain lake."),
      Tools = new List<ResponsesCreateRequestTool>
      {
          ResponsesCreateRequestTool.CreateImageGenerationToolDefinition(
              new ImageGenerationToolDefinition()
              {
                  Model = "meetkai:flux-2-klein",
                  Quality = ImageGenerationToolDefinitionQuality.High,
                  Size = ImageGenerationToolDefinitionSize.OneThousandAndTwentyFourx1024,
                  OutputFormat = OutputFormat.Png,
              }
          ),
      },
  });

  // The output includes an image_generation_call item with a result URL
  ```

  ```python Python SDK theme={null}
  result = sdk.llm.responses.create(
      model="meetkai:functionary-pt",
      input="Generate an image of a sunset over a mountain lake.",
      tools=[
          {
              "type": "image_generation",
              "model": "meetkai:flux-2-klein",
              "quality": "high",
              "size": "1024x1024",
              "output_format": "png",
          },
      ],
  )

  # The output includes an image_generation_call item with a result URL
  for item in result.output:
      if item.type == "image_generation_call":
          print("Image URL:", item.result)
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "model": "meetkai:functionary-pt",
      "input": "Generate an image of a sunset over a mountain lake.",
      "tools": [
        {
          "type": "image_generation",
          "model": "meetkai:flux-2-klein",
          "quality": "high",
          "size": "1024x1024",
          "output_format": "png"
        }
      ]
    }'
  ```
</CodeGroup>

The response includes an `image_generation_call` item with the URL of the generated image and the revised prompt used by the image model:

```json theme={null}
{
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "I'll generate an image of a beautiful sunset over a mountain lake for you."
        }
      ]
    },
    {
      "type": "image_generation_call",
      "id": "ig_abc123",
      "status": "completed",
      "result": "<Generated Image URL>",
      "revised_prompt": "A breathtaking sunset over a pristine mountain lake, with golden and orange hues reflecting on the calm water surface. Snow-capped mountain peaks in the background, dramatic clouds in the sky with vibrant sunset colors of pink, purple, and orange.",
      "size": "auto",
      "quality": "auto",
      "output_format": "png"
    }
  ]
}
```

The `result` field contains a URL to the generated image. The `revised_prompt` field shows the expanded prompt the image model used: the LLM turns your brief instruction into a detailed image description.
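A minimal sketch of pulling those two fields out of a response, assuming it has been parsed into a plain dictionary shaped like the JSON above. `extract_images` is a hypothetical helper name; SDK response objects may expose the same fields as attributes instead.

```python
def extract_images(response):
    """Collect (result, revised_prompt) pairs from completed
    image_generation_call items in a response dict."""
    images = []
    for item in response.get("output", []):
        is_image_call = item.get("type") == "image_generation_call"
        if is_image_call and item.get("status") == "completed":
            # revised_prompt is read with .get() in case it is absent.
            images.append((item["result"], item.get("revised_prompt")))
    return images
```

Filtering on `status` skips calls that did not finish, so every returned `result` should point at a usable image.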

### Force image generation

Use `tool_choice` to make sure the model generates an image instead of replying with text only.

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm responses create --body '{
    "model": "meetkai:functionary-pt",
    "input": "A red circle on a white background.",
    "tools": [{ "type": "image_generation" }],
    "tool_choice": { "type": "image_generation" }
  }'
  ```

  ```ts MKA1 SDK theme={null}
  const result = await mka1.llm.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'A red circle on a white background.',
    tools: [{ type: 'image_generation' }],
    toolChoice: { type: 'image_generation' },
  });
  ```

  ```ts OpenAI SDK theme={null}
  const response = await openai.responses.create({
    model: 'meetkai:functionary-pt',
    input: 'A red circle on a white background.',
    tools: [{ type: 'image_generation' }],
    tool_choice: { type: 'image_generation' },
    stream: false,
  });
  ```

  ```csharp C# SDK theme={null}
  var result = await sdk.Llm.Responses.CreateAsync(new ResponsesCreateRequest()
  {
      Model = "meetkai:functionary-pt",
      Input = ResponsesCreateRequestInput.CreateStr("A red circle on a white background."),
      Tools = new List<ResponsesCreateRequestTool>
      {
          ResponsesCreateRequestTool.CreateImageGenerationToolDefinition(
              new ImageGenerationToolDefinition()
          ),
      },
      ToolChoice = ToolChoice.CreateHostedToolChoice(new HostedToolChoice()
      {
          Type = HostedToolChoiceType.ImageGeneration,
      }),
  });
  ```

  ```python Python SDK theme={null}
  result = sdk.llm.responses.create(
      model="meetkai:functionary-pt",
      input="A red circle on a white background.",
      tools=[{"type": "image_generation"}],
      tool_choice={"type": "image_generation"},
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/responses \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data '{
      "model": "meetkai:functionary-pt",
      "input": "A red circle on a white background.",
      "tools": [{ "type": "image_generation" }],
      "tool_choice": { "type": "image_generation" }
    }'
  ```
</CodeGroup>

### Image output structure

The response's `output` array contains these items when an image is generated:

1. `function_call`: the model's call to the image generation tool, with the refined prompt
2. `image_generation_call`: the generation result, with `status: "completed"` and `result` (the image URL)
3. `function_call_output`: the raw tool output containing the URL
4. `message`: the model's text reply describing or referencing the image

Image URLs expire after 1 hour. Download or cache them if you need longer access.
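Because the URLs are short-lived, one option is to copy each image to local storage as soon as the response arrives. The sketch below is one way to do that with the Python standard library. `cache_image` is a hypothetical helper name, and since `result` may hold either a URL or base64 data, the function handles both.

```python
import base64
import hashlib
import os
import urllib.request


def cache_image(result, cache_dir, extension="png"):
    """Store an image locally and return its path.

    `result` is either a URL or base64-encoded image data. The file name is
    derived from a hash of `result`, so repeated calls reuse the cached file.
    """
    os.makedirs(cache_dir, exist_ok=True)
    name = hashlib.sha256(result.encode("utf-8")).hexdigest()[:16]
    path = os.path.join(cache_dir, f"{name}.{extension}")
    if os.path.exists(path):
        return path  # already cached; nothing to fetch
    if "://" in result:
        # Looks like a URL: download the bytes before the link expires.
        with urllib.request.urlopen(result) as resp:
            data = resp.read()
    else:
        # Otherwise treat the value as base64-encoded image data.
        data = base64.b64decode(result)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

The cache key is the `result` string itself, so a fresh URL for the same picture is stored as a separate file. Key on the item `id` instead if you need one file per generation call.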

## Standalone APIs

For direct access without going through the Responses API, MKA1 also provides standalone endpoints:

### Text-to-speech API

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm speech speak \
    --text 'Hello, welcome to the MKA1 platform.' \
    --language en \
    --output-file output.wav
  ```

  ```ts MKA1 SDK theme={null}
  const ttsResult = await mka1.llm.speech.speak({
    text: 'Hello, welcome to the MKA1 platform.',
    language: 'en',
  });
  ```

  ```csharp C# SDK theme={null}
  var res = await sdk.Llm.Speech.SpeakAsync(new TextToSpeechRequest()
  {
      Text = "Hello, welcome to the MKA1 platform.",
  });
  ```

  ```python Python SDK theme={null}
  result = sdk.llm.speech.speak(
      text="Hello, welcome to the MKA1 platform.",
      language="en",
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/speech/tts \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data '{
      "text": "Hello, welcome to the MKA1 platform.",
      "language": "en"
    }'
  ```
</CodeGroup>

### Images API

<CodeGroup>
  ```bash CLI theme={null}
  mka1 llm images create \
    --model meetkai:z-image-turbo \
    --prompt 'A futuristic city skyline at dusk' \
    --size 1024x1024 \
    --quality hd
  ```

  ```ts MKA1 SDK theme={null}
  const imageResult = await mka1.llm.images.generate({
    model: 'meetkai:z-image-turbo',
    prompt: 'A futuristic city skyline at dusk',
    size: '1024x1024',
    quality: 'hd',
  });
  ```

  ```csharp C# SDK theme={null}
  var imageResult = await sdk.Llm.Images.CreateAsync(new ImageGenerationRequest()
  {
      Model = "meetkai:z-image-turbo",
      Prompt = "A futuristic city skyline at dusk",
      Size = ImageGenerationRequestSize.OneThousandAndTwentyFourx1024,
      Quality = ImageGenerationRequestQuality.Hd,
  });
  ```

  ```python Python SDK theme={null}
  image_result = sdk.llm.images.create(
      model="meetkai:z-image-turbo",
      prompt="A futuristic city skyline at dusk",
      size="1024x1024",
      quality="hd",
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/llm/images/generations \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --data '{
      "model": "meetkai:z-image-turbo",
      "prompt": "A futuristic city skyline at dusk",
      "size": "1024x1024",
      "quality": "hd"
    }'
  ```
</CodeGroup>

## Next steps

* [Multimodal input](/pt/docs/multimodal-input): send images, audio, and documents to the model
* [Speech](/pt/docs/speech): transcribe audio and generate speech with the standalone speech endpoints
* [Advanced voice mode](/pt/docs/advanced-voice-mode): real-time voice conversations with LiveKit
* [Generate a response](/pt/docs/generate-a-response): text requests and multi-turn exchanges
