

The Batch API lets you send groups of requests as a single job that processes asynchronously. This is useful when you need to run many requests and do not need immediate results — for example, running evaluations, generating embeddings for a large dataset, or classifying content in bulk. Batch requests run within a 24-hour completion window and have separate, higher rate limits than synchronous API calls.

Supported endpoints

| Endpoint | Description |
| --- | --- |
| /v1/chat/completions | Chat completion requests |
| /v1/embeddings | Embedding generation |
| /v1/images/generations | Image generation |
All requests in a single batch must target the same endpoint.

Lifecycle

A batch moves through these statuses:
validating → in_progress → finalizing → completed
     ↓             ↓
   failed      cancelling → cancelled

| Status | Description |
| --- | --- |
| validating | The input file is being checked for format and content errors. |
| failed | Validation failed — the input file contains errors. Check batch.errors for details. |
| in_progress | Requests are being processed. |
| finalizing | All requests have been processed and the output files are being generated. |
| completed | The batch finished. Download results from output_file_id. |
| cancelling | A cancel was requested. In-flight requests are finishing. |
| cancelled | The batch was cancelled. Partial results may be available. |
| expired | The batch did not complete within the 24-hour window. |
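Only four of these statuses are terminal; a poller can stop as soon as it sees one. A minimal Python sketch (the names are illustrative, not part of any SDK):

```python
# Terminal batch statuses, per the lifecycle table above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def is_terminal(status: str) -> bool:
    """True once the batch will no longer change state."""
    return status in TERMINAL_STATUSES
```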

Step 1 — Prepare the input file

Create a JSONL file where each line is one request. Every line has four fields:
| Field | Type | Description |
| --- | --- | --- |
| custom_id | string | Your identifier for this request. Used to match input to output. Must be unique within the file. |
| method | string | "POST" — the only supported method. |
| url | string | The endpoint path — must match the endpoint you declare when creating the batch. |
| body | object | The request body — the same parameters you would send to the synchronous endpoint. |
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "Summarize the benefits of batch processing in one sentence."}], "max_tokens": 100}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 100}}
{"custom_id": "request-3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "Explain embeddings in one paragraph."}], "max_tokens": 100}}
A single batch can contain up to 10,000 requests.
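If you generate the input file programmatically, a small script can enforce the field rules above. A hedged Python sketch — `build_batch_lines` and the sample prompts are illustrative, not part of any SDK:

```python
import json

# Hypothetical prompts to batch; each body mirrors what you would send
# to the synchronous /v1/chat/completions endpoint.
prompts = [
    "Summarize the benefits of batch processing in one sentence.",
    "What is the capital of France?",
]

def build_batch_lines(prompts, url="/v1/chat/completions", model="auto"):
    """Return one JSONL line per prompt, with unique custom_ids."""
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{i}",  # unique within the file
            "method": "POST",             # the only supported method
            "url": url,                   # must match the batch endpoint
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": 100,
            },
        }
        lines.append(json.dumps(request))
    return lines

with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(build_batch_lines(prompts)) + "\n")
```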

Step 2 — Upload the input file

Upload the JSONL file using the Files API with purpose: "batch".
mka1 llm files upload \
  --file ./batch_input.jsonl \
  --purpose batch \
  -H 'X-On-Behalf-Of: <end-user-id>'

Step 3 — Create the batch

Pass the uploaded file ID, the target endpoint, and the completion window.
mka1 llm batches create --body '{
  "input_file_id": "file_abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "24h"
}'
You can also attach metadata for your own tracking:
mka1 llm batches create --body '{
  "input_file_id": "file_abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "24h",
  "metadata": {
    "description": "nightly evaluation run",
    "run_id": "eval-2026-03-31"
  }
}'

Step 4 — Check batch status

Poll the batch until it reaches a terminal status.
mka1 llm batches get --batch-id batch_abc123
Here is a polling helper that waits for the batch to finish:
# Poll a batch until it reaches a terminal status using --jq and a shell loop.
BATCH_ID=batch_abc123
while :; do
  STATUS=$(mka1 llm batches get --batch-id "$BATCH_ID" --jq '.status' --output-format json)
  echo "status: $STATUS"
  case "$STATUS" in
    completed|failed|cancelled|expired) break ;;
  esac
  sleep 2
done

Step 5 — Download the results

Once the batch is completed, download the output file. It is a JSONL file where each line contains the custom_id you provided, the response, and any error.
# Download the JSONL output file
mka1 llm files content \
  --file-id file_xyz789 \
  --output-file ./batch_output.jsonl

# Inspect the results inline with jq
mka1 llm files content --file-id file_xyz789 \
  --jq '"\(.custom_id): status=\(.response.status_code)"'
Each line in the output file has this structure:
{
  "id": "response_abc123",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "request_id": "req_abc123",
    "body": { "...": "same shape as the synchronous endpoint response" }
  },
  "error": null
}
If a request failed, response is null and error contains the details:
{
  "id": "response_def456",
  "custom_id": "request-2",
  "response": null,
  "error": {
    "code": "processing_error",
    "message": "The request could not be processed."
  }
}
If any requests failed, the batch also provides an error_file_id containing only the failed entries.
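Because output lines are not guaranteed to arrive in input order, it is convenient to key results by custom_id when post-processing. A Python sketch based on the output shape shown above — `parse_batch_output` is an illustrative helper, not a provided API:

```python
import json

def parse_batch_output(path):
    """Split a batch output JSONL file into successes and failures.

    Returns (results, failures): custom_id -> response body for
    successful requests, and custom_id -> error object for failed ones.
    """
    results, failures = {}, {}
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            if record.get("error") is not None:
                failures[record["custom_id"]] = record["error"]
            else:
                results[record["custom_id"]] = record["response"]["body"]
    return results, failures
```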

Cancel a batch

Cancel a batch that is still in progress. Requests that have already completed remain in the output.
mka1 llm batches cancel --batch-id batch_abc123
The batch transitions to cancelling while in-flight requests finish, then to cancelled.

List batches

Retrieve all batches for the current account, newest first. Supports pagination.
mka1 llm batches list --limit 20
Use the after parameter with a batch ID to page through results.

Example: batch embeddings

The same flow works for embeddings. Change the url in each JSONL line and the endpoint when creating the batch.
{"custom_id": "embed-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "auto", "input": "The quick brown fox"}}
{"custom_id": "embed-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "auto", "input": "jumps over the lazy dog"}}
# Upload the embeddings JSONL input
FILE_ID=$(mka1 llm files upload \
  --file ./embed_batch.jsonl \
  --purpose batch \
  --jq '.id' --output-format json | tr -d '"')

# Create the batch against the embeddings endpoint
mka1 llm batches create --body "{
  \"input_file_id\": \"$FILE_ID\",
  \"endpoint\": \"/v1/embeddings\",
  \"completion_window\": \"24h\"
}"

# Poll, then download the results — see Steps 4 and 5

Validation errors

If the input file has formatting issues, the batch moves to failed immediately. Common causes:
  • Invalid JSON — a line is not valid JSON.
  • Missing fields — a line is missing custom_id, method, url, or body.
  • Wrong method — method must be "POST".
  • URL mismatch — the url in a line does not match the endpoint declared when creating the batch.
  • Duplicate custom_id — each custom_id must be unique within the file.
Check batch.errors.data for the specific error messages and line numbers.
mka1 llm batches get --batch-id batch_abc123 \
  --jq '.errors.data[] | "Line \(.line): [\(.code)] \(.message)"'
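Most of these errors can be caught locally before uploading. A Python sketch that mirrors the server-side checks listed above — `validate_batch_file` is an illustrative helper, not a provided API:

```python
import json

REQUIRED_FIELDS = {"custom_id", "method", "url", "body"}

def validate_batch_file(path, endpoint):
    """Return a list of (line_number, message) problems; empty means the file looks valid."""
    problems = []
    seen_ids = set()
    with open(path) as f:
        for n, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((n, "invalid JSON"))
                continue
            missing = REQUIRED_FIELDS - record.keys()
            if missing:
                problems.append((n, f"missing fields: {sorted(missing)}"))
                continue
            if record["method"] != "POST":
                problems.append((n, 'method must be "POST"'))
            if record["url"] != endpoint:
                problems.append((n, f"url does not match endpoint {endpoint}"))
            if record["custom_id"] in seen_ids:
                problems.append((n, f"duplicate custom_id {record['custom_id']!r}"))
            seen_ids.add(record["custom_id"])
    return problems
```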

See also