

The Batch API lets you send groups of requests as a single job that processes asynchronously. This is useful when you need to run many requests and do not need immediate results — for example, running evaluations, generating embeddings for a large dataset, or classifying content in bulk. Batch requests run within a 24-hour completion window and have separate, higher rate limits than synchronous API calls.

Supported endpoints

| Endpoint | Description |
| --- | --- |
| /v1/chat/completions | Chat completion requests |
| /v1/embeddings | Embedding generation |
| /v1/images/generations | Image generation |
All requests in a single batch must target the same endpoint.

Lifecycle

A batch moves through these statuses:
validating → in_progress → finalizing → completed
     ↓             ↓
   failed      cancelling → cancelled

| Status | Description |
| --- | --- |
| validating | The input file is being checked for format and content errors. |
| failed | Validation failed — the input file contains errors. Check batch.errors for details. |
| in_progress | Requests are being processed. |
| finalizing | All requests have been processed and the output files are being generated. |
| completed | The batch finished. Download results from output_file_id. |
| cancelling | A cancel was requested. In-flight requests are finishing. |
| cancelled | The batch was cancelled. Partial results may be available. |
| expired | The batch did not complete within the 24-hour window. |
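Only four of these statuses are terminal; a poller can stop as soon as it sees one. A minimal Python sketch (the names are illustrative, not part of any SDK):

```python
# Terminal batch statuses, per the lifecycle table above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def is_terminal(status: str) -> bool:
    """True once the batch will no longer change state."""
    return status in TERMINAL_STATUSES
```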

Step 1 — Prepare the input file

Create a JSONL file where each line is one request. Every line has four fields:
| Field | Type | Description |
| --- | --- | --- |
| custom_id | string | Your identifier for this request. Used to match input to output. Must be unique within the file. |
| method | string | "POST" — the only supported method. |
| url | string | The endpoint path — must match the endpoint you declare when creating the batch. |
| body | object | The request body — the same parameters you would send to the synchronous endpoint. |
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "Summarize the benefits of batch processing in one sentence."}], "max_tokens": 100}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 100}}
{"custom_id": "request-3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "auto", "messages": [{"role": "user", "content": "Explain embeddings in one paragraph."}], "max_tokens": 100}}
A single batch can contain up to 10,000 requests.
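If you generate the input file programmatically, a small script can enforce the field rules above. A hedged Python sketch — `build_batch_lines` and the sample prompts are illustrative, not part of any SDK:

```python
import json

# Hypothetical prompts to batch; each body mirrors what you would send
# to the synchronous /v1/chat/completions endpoint.
prompts = [
    "Summarize the benefits of batch processing in one sentence.",
    "What is the capital of France?",
]

def build_batch_lines(prompts, url="/v1/chat/completions", model="auto"):
    """Return one JSONL line per prompt, with unique custom_ids."""
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{i}",  # unique within the file
            "method": "POST",             # the only supported method
            "url": url,                   # must match the batch endpoint
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": 100,
            },
        }
        lines.append(json.dumps(request))
    return lines

with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(build_batch_lines(prompts)) + "\n")
```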

Step 2 — Upload the input file

Upload the JSONL file using the Files API with purpose: "batch".
mka1 llm files upload \
  --file ./batch_input.jsonl \
  --purpose batch \
  -H 'X-On-Behalf-Of: <end-user-id>'

Step 3 — Create the batch

Pass the uploaded file ID, the target endpoint, and the completion window.
mka1 llm batches create --body '{
  "input_file_id": "file_abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "24h"
}'
You can also attach metadata for your own tracking:
mka1 llm batches create --body '{
  "input_file_id": "file_abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "24h",
  "metadata": {
    "description": "nightly evaluation run",
    "run_id": "eval-2026-03-31"
  }
}'

Step 4 — Check batch status

Poll the batch until it reaches a terminal status.
mka1 llm batches get --batch-id batch_abc123
Here is a polling helper that waits for the batch to finish:
# Poll a batch until it reaches a terminal status using --jq and a shell loop.
BATCH_ID=batch_abc123
while :; do
  STATUS=$(mka1 llm batches get --batch-id "$BATCH_ID" --jq '.status' --output-format json)
  echo "status: $STATUS"
  case "$STATUS" in
    completed|failed|cancelled|expired) break ;;
  esac
  sleep 2
done

Step 5 — Download the results

Once the batch is completed, download the output file. It is a JSONL file where each line contains the custom_id you provided, the response, and any error.
# Download the JSONL output file
mka1 llm files content \
  --file-id file_xyz789 \
  --output-file ./batch_output.jsonl

# Inspect the results inline with jq
mka1 llm files content --file-id file_xyz789 \
  --jq '"\(.custom_id): status=\(.response.status_code)"'
Each line in the output file has this structure:
{
  "id": "response_abc123",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "request_id": "req_abc123",
    "body": { "...": "same shape as the synchronous endpoint response" }
  },
  "error": null
}
If a request failed, response is null and error contains the details:
{
  "id": "response_def456",
  "custom_id": "request-2",
  "response": null,
  "error": {
    "code": "processing_error",
    "message": "The request could not be processed."
  }
}
If any requests failed, the batch also provides an error_file_id containing only the failed entries.
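Because output lines are not guaranteed to arrive in input order, it is convenient to key results by custom_id when post-processing. A Python sketch based on the output shape shown above — `parse_batch_output` is an illustrative helper, not a provided API:

```python
import json

def parse_batch_output(path):
    """Split a batch output JSONL file into successes and failures.

    Returns (results, failures): custom_id -> response body for
    successful requests, and custom_id -> error object for failed ones.
    """
    results, failures = {}, {}
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            if record.get("error") is not None:
                failures[record["custom_id"]] = record["error"]
            else:
                results[record["custom_id"]] = record["response"]["body"]
    return results, failures
```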

Cancel a batch

Cancel a batch that is still in progress. Requests that have already completed remain in the output.
mka1 llm batches cancel --batch-id batch_abc123
The batch transitions to cancelling while in-flight requests finish, then to cancelled.

List batches

Retrieve all batches for the current account, newest first. Supports pagination.
mka1 llm batches list --limit 20
Use the after parameter with a batch ID to page through results.

Example: batch embeddings

The same flow works for embeddings. Change the url in each JSONL line and the endpoint when creating the batch.
{"custom_id": "embed-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "auto", "input": "The quick brown fox"}}
{"custom_id": "embed-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "auto", "input": "jumps over the lazy dog"}}
# Upload the embeddings JSONL input
FILE_ID=$(mka1 llm files upload \
  --file ./embed_batch.jsonl \
  --purpose batch \
  --jq '.id' --output-format json | tr -d '"')

# Create the batch against the embeddings endpoint
mka1 llm batches create --body "{
  \"input_file_id\": \"$FILE_ID\",
  \"endpoint\": \"/v1/embeddings\",
  \"completion_window\": \"24h\"
}"

# Poll, then download the results — see Steps 4 and 5

Validation errors

If the input file has formatting issues, the batch moves to failed immediately. Common causes:
  • Invalid JSON — a line is not valid JSON.
  • Missing fields — a line is missing custom_id, method, url, or body.
  • Wrong method — method must be "POST".
  • URL mismatch — the url in a line does not match the endpoint declared when creating the batch.
  • Duplicate custom_id — each custom_id must be unique within the file.
Check batch.errors.data for the specific error messages and line numbers.
mka1 llm batches get --batch-id batch_abc123 \
  --jq '.errors.data[] | "Line \(.line): [\(.code)] \(.message)"'
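Most of these errors can be caught locally before uploading. A Python sketch that mirrors the server-side checks listed above — `validate_batch_file` is an illustrative helper, not a provided API:

```python
import json

REQUIRED_FIELDS = {"custom_id", "method", "url", "body"}

def validate_batch_file(path, endpoint):
    """Return a list of (line_number, message) problems; empty means the file looks valid."""
    problems = []
    seen_ids = set()
    with open(path) as f:
        for n, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((n, "invalid JSON"))
                continue
            missing = REQUIRED_FIELDS - record.keys()
            if missing:
                problems.append((n, f"missing fields: {sorted(missing)}"))
                continue
            if record["method"] != "POST":
                problems.append((n, 'method must be "POST"'))
            if record["url"] != endpoint:
                problems.append((n, f"url does not match endpoint {endpoint}"))
            if record["custom_id"] in seen_ids:
                problems.append((n, f"duplicate custom_id {record['custom_id']!r}"))
            seen_ids.add(record["custom_id"])
    return problems
```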

See also