Use the Responses resource when you want the MKA1 API to return text. Start with a plain string for simple prompts. Use message items when you need explicit roles or conversation state.

Send a simple prompt

Pass a string in input for a single-turn request. The response includes generated text in output_text.
mka1 llm responses create \
  --model auto \
  --input '"Write a one-sentence summary of the MKA1 API."' \
  -H 'X-On-Behalf-Of: <end-user-id>'
If you are not acting for an end user, omit X-On-Behalf-Of.
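The generated text comes back in the output_text field. A minimal sketch of pulling it out with jq, assuming the CLI prints the response body as JSON (the sample response shape below is illustrative, not the full schema):

```shell
# Illustrative response body; the real response includes more fields.
response='{"id": "resp_123", "model": "auto", "output_text": "MKA1 turns prompts into text."}'

# Extract just the generated text with jq's raw-output mode.
echo "$response" | jq -r '.output_text'
```

The same jq filter works on any of the requests in this guide, since every response carries output_text.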

Add instructions

Use instructions to define behavior before the model sees the user input. Keep instructions short and specific.
mka1 llm responses create \
  --model auto \
  --instructions 'You are a support assistant. Reply in plain English. Keep answers under 80 words.' \
  --input '"Explain what embeddings are used for."'

Send structured messages

Use an array of message items in input when you want explicit roles. Each message item uses type, role, and content.
mka1 llm responses create --body '{
  "model": "auto",
  "input": [
    { "type": "message", "role": "developer", "content": "Answer as a technical writer. Keep the reply concise." },
    { "type": "message", "role": "user", "content": "Draft a short product update about faster response times." }
  ]
}'
This pattern is useful when you want the request body to carry the message history directly.

Continue a multi-turn exchange

Use previous_response_id to continue from an earlier response without resending the full history.
mka1 llm responses create \
  --model auto \
  --previous-response-id resp_123 \
  --input '"Now turn that into an email subject line."'
If you need a reusable conversation container, create one with the Conversations resource and then pass the conversation ID in conversation.
# Create a conversation
mka1 llm conversations create --body '{
  "metadata": { "session_id": "web-42" }
}'

# Use the conversation in a response request
mka1 llm responses create \
  --model auto \
  --conversation conv_123 \
  --input '"What should I ask next to refine this draft?"'
See the Conversations and Responses pages in the API Reference for the full resource workflow.

Stream text as it is generated

Set stream to true to receive server-sent events instead of waiting for the full response.
mka1 llm responses create \
  --model auto \
  --input '"Write three release notes bullets for our docs update."' \
  --stream
Use streaming when you want to render partial output as it arrives.
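Server-sent events arrive one per line, prefixed with "data: ". A minimal sketch of rendering deltas as they arrive, assuming each event's JSON payload carries the new text in a delta field and the stream ends with a [DONE] marker (both are assumptions about the event format, not documented behavior):

```shell
# Simulated SSE stream; a real stream would be piped from the CLI.
stream='data: {"delta": "Fast"}
data: {"delta": "er docs."}
data: [DONE]'

# Print each text delta as soon as its event line is read.
printf '%s\n' "$stream" | while IFS= read -r line; do
  case "$line" in
    "data: [DONE]") break ;;                                          # end-of-stream marker
    data:\ *) printf '%s' "$(printf '%s' "${line#data: }" | jq -r '.delta')" ;;
  esac
done
printf '\n'
```

Reading line by line like this keeps memory flat and lets you flush partial output to the user immediately instead of buffering the whole response.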

Next steps