Create an evaluation run

from meetkai_mka1 import SDK, models

with SDK(
    bearer_auth="<YOUR_BEARER_TOKEN_HERE>",
) as sdk:
    res = sdk.llm.evals.create_run(
        suite_id="eval_suite_aa87e2b1112a455b8deabed784372198",
        models=[
            "auto",
        ],
        judge_model="auto",
        embedding_model="auto",
        generation=models.EvalGenerationConfig(
            temperature=0,
            max_gen_toks=512,
            until=[
                "<|endoftext|>",
            ],
            do_sample=False,
            chat_template_kwargs={
                "enable_thinking": False,
            },
            timeout_seconds=120,
            max_retries=2,
            max_empty_retries=1,
        ),
        generation_concurrency=4,
        grader_concurrency=2,
        max_workflow_sample_activities=5000,
        metadata={
            "purpose": "mvp",
        },
    )

    # Handle response
    print(res)

{
  "id": "eval_run_aa87e2b1112a455b8deabed784372198",
  "object": "eval.run",
  "suite_id": "eval_suite_aa87e2b1112a455b8deabed784372198",
  "suite_version": 1,
  "suite_version_id": "eval_sver_aa87e2b1112a455b8deabed784372198",
  "status": "in_progress",
  "models": [
    "auto"
  ],
  "task_ids": null,
  "judge_model": "auto",
  "embedding_model": "auto",
  "generation": {
    "temperature": 0,
    "max_output_tokens": 512
  },
  "request_counts": {
    "total": 100,
    "completed": 10,
    "failed": 0
  },
  "metrics": null,
  "error": null,
  "artifact_file_ids": [],
  "metadata": {
    "purpose": "mvp"
  },
  "created_at": 1704067200,
  "started_at": 1704067210,
  "completed_at": null,
  "cancelled_at": null,
  "failed_at": null
}

Authorizations

Authorization

string

header

required

Gateway auth: send Authorization: Bearer <mka1-api-key>. For multi-user server-side integrations, you can also send X-On-Behalf-Of: <external-user-id>.

Headers

X-On-Behalf-Of

string

Optional external end-user identifier forwarded by the API gateway.
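
The two headers described above can be assembled as in the following minimal sketch. The helper name build_headers is hypothetical and not part of the meetkai_mka1 SDK, which takes the token through bearer_auth instead; the Content-Type value is taken from the request body type listed for this endpoint.

```python
from typing import Optional


def build_headers(api_key: str, on_behalf_of: Optional[str] = None) -> dict:
    """Build gateway request headers (hypothetical helper, not an SDK function)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {
        # Required: gateway auth uses a bearer token.
        "Authorization": f"Bearer {api_key}",
        # The request body for this endpoint is JSON.
        "Content-Type": "application/json",
    }
    # Optional: only sent when acting for an external end user.
    if on_behalf_of:
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers
```

Leaving on_behalf_of unset omits the X-On-Behalf-Of header entirely, which matches its optional status.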

Body

application/json

suite_id

string

required

models

string[]

required

Required array length: 1 - 20 elements

Minimum string length: 1

suite_version

integer

Required range: 1 <= x <= 9007199254740991

task_ids

string[]

Minimum array length: 1

judge_model

string

embedding_model

string

generation

object

generation_concurrency

integer

Required range: 1 <= x <= 256

concurrency

integer

Required range: 1 <= x <= 256

grader_concurrency

integer

Required range: 1 <= x <= 256

max_samples_per_task

integer

Required range: 1 <= x <= 9007199254740991

max_workflow_sample_activities

integer

Maximum sample-activity budget per Temporal workflow run before the workflow continues as new.

Required range: 100 <= x <= 50000

metadata

object
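
The length and range constraints listed for the body parameters can be checked on the client before sending a request. The sketch below is one way to do that; validate_run_request and INT_RANGES are hypothetical names, not SDK features, and the code assumes that "Minimum string length: 1" applies to each entry of models.

```python
MAX_SAFE_INT = 9007199254740991  # upper bound the reference lists for open-ended integers

# (min, max) ranges copied from the body parameter list.
INT_RANGES = {
    "suite_version": (1, MAX_SAFE_INT),
    "generation_concurrency": (1, 256),
    "concurrency": (1, 256),
    "grader_concurrency": (1, 256),
    "max_samples_per_task": (1, MAX_SAFE_INT),
    "max_workflow_sample_activities": (100, 50000),
}


def validate_run_request(body: dict) -> list:
    """Return a list of problems; an empty list means the body passed these checks.

    Hypothetical client-side helper; the server remains the source of truth.
    """
    problems = []
    if not isinstance(body.get("suite_id"), str):
        problems.append("suite_id is required and must be a string")

    models = body.get("models")
    if not isinstance(models, list) or not 1 <= len(models) <= 20:
        problems.append("models is required and must contain 1 to 20 entries")
    elif any(not isinstance(m, str) or len(m) < 1 for m in models):
        # Assumption: the minimum string length applies to each model name.
        problems.append("each entry in models must be a non-empty string")

    task_ids = body.get("task_ids")
    if task_ids is not None and (not isinstance(task_ids, list) or len(task_ids) < 1):
        problems.append("task_ids, when present, must contain at least 1 entry")

    for name, (low, high) in INT_RANGES.items():
        value = body.get(name)
        if value is None:
            continue  # optional field not supplied
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or not low <= value <= high:
            problems.append(f"{name} must be an integer in [{low}, {high}]")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in a single pass.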

Response

200 - application/json

id

string

required

object

any

required

suite_id

string

required

suite_version

integer

required

Required range: -9007199254740991 <= x <= 9007199254740991

suite_version_id

string

required

status

enum<string>

required

Available options:

queued,

in_progress,

finalizing,

completed,

failed,

cancelling,

cancelled
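
A client that polls a run usually needs to know when to stop. The sketch below classifies the status values listed above. The reference does not say which statuses are final, so treating completed, failed, and cancelled as terminal is an assumption, and is_terminal is a hypothetical helper.

```python
# Status values copied from the enum in this reference.
RUN_STATUSES = (
    "queued",
    "in_progress",
    "finalizing",
    "completed",
    "failed",
    "cancelling",
    "cancelled",
)

# Assumption: these three are final; the reference does not state this explicitly.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def is_terminal(status: str) -> bool:
    """Return True when no further status change is expected (under the assumption above)."""
    if status not in RUN_STATUSES:
        raise ValueError(f"unknown run status: {status!r}")
    return status in TERMINAL_STATUSES
```

Rejecting unknown values makes a newly introduced status visible instead of silently treating it as non-terminal.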

models

string[]

required

task_ids

string[] | null

required

judge_model

string | null

required

embedding_model

string | null

required

generation

object

required

request_counts

object

required

metrics

object

required

error

object

required

artifact_file_ids

string[]

required

metadata

object

required

created_at

integer

required

Required range: -9007199254740991 <= x <= 9007199254740991

started_at

integer | null

required

Required range: -9007199254740991 <= x <= 9007199254740991

completed_at

integer | null

required

Required range: -9007199254740991 <= x <= 9007199254740991

cancelled_at

integer | null

required

Required range: -9007199254740991 <= x <= 9007199254740991

failed_at

integer | null

required

Required range: -9007199254740991 <= x <= 9007199254740991
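
The response fields can be turned into a short progress summary, as in the sketch below. It assumes the timestamps are Unix epoch seconds, which the sample values suggest but the reference does not state, and that request_counts carries the total, completed, and failed keys shown in the sample response. summarize_run is a hypothetical helper that takes the response as a plain dict.

```python
from datetime import datetime, timezone


def summarize_run(run: dict) -> dict:
    """Summarize progress and queue wait for an eval run (hypothetical helper)."""
    counts = run["request_counts"]
    total = counts["total"]
    # Count failed requests as processed so progress reaches 1.0 when the run is done.
    processed = counts["completed"] + counts["failed"]
    progress = processed / total if total else 0.0

    # started_at is nullable: a queued run has not started yet.
    started_at = run["started_at"]
    queue_wait = None if started_at is None else started_at - run["created_at"]

    return {
        "progress": progress,
        "queue_wait_seconds": queue_wait,
        # Assumption: created_at is in Unix epoch seconds.
        "created_at_utc": datetime.fromtimestamp(
            run["created_at"], tz=timezone.utc
        ).isoformat(),
    }
```

Applied to the sample response on this page, this gives a progress of 0.1 (10 of 100 requests processed) and a queue wait of 10 seconds.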