
Speech

(llm.speech)

Overview

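Available Operations

  • transcribe: Convert audio to text using speech recognition.
  • textToSpeech: Convert Urdu text to speech using the MK TTS service.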

transcribe

Convert audio to text using advanced speech recognition.

Upload Methods:

  1. Complete File Upload (Standard)

    • Use Content-Type: multipart/form-data
    • Upload the complete audio file in one request
    • Maximum file size: 25MB
    • Example (curl):
    bash
    curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
      -F "file=@audio.flac"
  2. Chunked Upload (Streaming)

    • Send the Transfer-Encoding: chunked header
    • Stream audio data in chunks while it is being recorded
    • The total file size does not need to be known upfront
    • The server buffers all chunks and starts processing only after the upload completes
    • Maximum total size: 25MB
    • Example (curl):
    bash
    curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
      -H "Transfer-Encoding: chunked" \
      -H "Content-Type: multipart/form-data" \
      --data-binary @audio.flac
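
The chunked flow above can be sketched on the client side as an async generator that forwards audio chunks and stops once the documented total is exceeded. This is a minimal sketch, not part of the SDK: `capChunks`, `fakeRecorder`, and `countBytes` are hypothetical names, and reading "25MB" as 25 * 1024 * 1024 bytes is an assumption.

```typescript
// Assumption: "25MB" is read as 25 * 1024 * 1024 bytes; the docs do not say
// whether the server counts in MB or MiB.
const MAX_TOTAL_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: re-emits chunks from any async source and throws as soon
// as the running total exceeds the limit.
async function* capChunks(
  source: AsyncIterable<Uint8Array>,
  maxBytes: number = MAX_TOTAL_BYTES,
): AsyncGenerator<Uint8Array> {
  let total = 0;
  for await (const chunk of source) {
    total += chunk.byteLength;
    if (total > maxBytes) {
      throw new Error(`Audio stream exceeded ${maxBytes} bytes`);
    }
    yield chunk;
  }
}

// Hypothetical stand-in for a recorder: yields `count` chunks of `size` bytes.
async function* fakeRecorder(count: number, size: number): AsyncGenerator<Uint8Array> {
  for (let i = 0; i < count; i++) {
    yield new Uint8Array(size);
  }
}

// Drains a chunk source and reports how many bytes passed through.
async function countBytes(source: AsyncIterable<Uint8Array>): Promise<number> {
  let total = 0;
  for await (const chunk of source) {
    total += chunk.byteLength;
  }
  return total;
}
```

The capped generator could then be handed to whichever HTTP client streams the request body; that wiring is left out because it depends on the runtime.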

Supported Formats: FLAC, MP3, MP4, MPEG, MPGA, M4A, OGG, WAV, WebM
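
Checking a file against the size limit and format list before uploading can avoid a wasted request. The sketch below is illustrative only: `preflightAudio` is a hypothetical helper, it judges the format by file extension (the server may inspect the content instead), and it assumes "25MB" means 25 * 1024 * 1024 bytes.

```typescript
// Extensions taken from the "Supported Formats" list, lowercased.
const SUPPORTED_EXTENSIONS = [
  "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
];

// Assumption: the documented 25MB limit is 25 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 25 * 1024 * 1024;

type PreflightResult = { ok: true } | { ok: false; reason: string };

// Hypothetical helper: validates a file name and size before upload.
function preflightAudio(fileName: string, sizeBytes: number): PreflightResult {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `unsupported extension: "${ext}"` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return {
      ok: false,
      reason: `file is ${sizeBytes} bytes, limit is ${MAX_FILE_BYTES}`,
    };
  }
  return { ok: true };
}
```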

Query Parameters:

  • language (optional): ISO 639-1 language code (e.g., "en", "es", "fr"). The language is auto-detected if omitted.
  • prompt (optional): Text that guides the transcription style
  • temperature (optional): Sampling temperature from 0 to 1 (higher values produce more random output)
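
For raw HTTP calls, the query parameters can be assembled with the standard URL API. The following is a minimal sketch rather than SDK code: `buildTranscriptionUrl` and `TranscriptionQuery` are hypothetical names, the path is copied from the curl examples, and the two-lowercase-letter check is an assumed reading of "ISO 639-1 code".

```typescript
// Hypothetical shape mirroring the three documented query parameters.
interface TranscriptionQuery {
  language?: string;
  prompt?: string;
  temperature?: number;
}

// Builds the transcription endpoint URL, validating values client-side.
function buildTranscriptionUrl(
  baseUrl: string,
  query: TranscriptionQuery = {},
): string {
  // Path taken from the curl examples in this section.
  const url = new URL("/api/v1/llm/speech/transcriptions", baseUrl);

  if (query.language !== undefined) {
    // Assumption: codes are sent as two lowercase letters, e.g. "en".
    if (!/^[a-z]{2}$/.test(query.language)) {
      throw new RangeError("language must be a two-letter code");
    }
    url.searchParams.set("language", query.language);
  }
  if (query.prompt !== undefined) {
    url.searchParams.set("prompt", query.prompt);
  }
  if (query.temperature !== undefined) {
    // The documented range is 0 to 1; NaN also fails this check.
    if (!(query.temperature >= 0 && query.temperature <= 1)) {
      throw new RangeError("temperature must be between 0 and 1");
    }
    url.searchParams.set("temperature", String(query.temperature));
  }
  return url.toString();
}
```

Omitted parameters are left out of the URL entirely, so the server-side defaults (such as language auto-detection) still apply.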

Response: Returns transcribed text in JSON format.

Example Usage

typescript
import { SDK } from "@meetkai/mka1";
import { openAsBlob } from "node:fs";

const sdk = new SDK({
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const result = await sdk.llm.speech.transcribe({
    requestBody: {
      file: await openAsBlob("example.file"),
    },
  });

  console.log(result);
}

run();

Standalone function

The standalone function version of this method:

typescript
import { SDKCore } from "@meetkai/mka1/core.js";
import { llmSpeechTranscribe } from "@meetkai/mka1/funcs/llmSpeechTranscribe.js";
import { openAsBlob } from "node:fs";

// Use `SDKCore` for best tree-shaking performance.
// You can create one instance of it to use across an application.
const sdk = new SDKCore({
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const res = await llmSpeechTranscribe(sdk, {
    requestBody: {
      file: await openAsBlob("example.file"),
    },
  });
  if (res.ok) {
    const { value: result } = res;
    console.log(result);
  } else {
    console.log("llmSpeechTranscribe failed:", res.error);
  }
}

run();

React hooks and utilities

This method can be used in React components through the following hooks and associated utilities.

Check out this guide for information about each of the utilities below and how to get started using React hooks.

tsx
import {
  // Mutation hook for triggering the API call.
  useLlmSpeechTranscribeMutation
} from "@meetkai/mka1/react-query/llmSpeechTranscribe.js";

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| request | operations.TranscribeRequest | ✔️ | The request object to use for the request. |
| options | RequestOptions | | Used to set various options for making HTTP requests. |
| options.fetchOptions | RequestInit | | Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All Request options, except method and body, are allowed. |
| options.retries | RetryConfig | | Enables retrying HTTP requests under certain failure conditions. |

Response

Promise<operations.TranscribeResponseBody>

Errors

| Error Type | Status Code | Content Type |
| --- | --- | --- |
| errors.APIError | 4XX, 5XX | */* |

textToSpeech

Convert Urdu text to speech using the MK TTS service.

Request Body:

  • text (required): Input text in Urdu (Arabic script)
  • speaking_rate (optional): Speech speed multiplier (default: 1.0; higher = faster, lower = slower)
  • noise_scale (optional): Prosody/expressiveness variation (default: 0.667; 0.0 = monotone, higher = more varied)
  • noise_scale_duration (optional): Timing/rhythm variation (default: 0.8)

Response: Returns an audio file in WAV format.
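
The request body fields and their defaults can be captured in a small builder. This is a sketch under stated assumptions, not SDK code: `buildTtsBody`, `hasArabicScript`, and `TtsRequestBody` are hypothetical names, the snake_case field names follow the list above (the SDK's own request type may spell them differently), and the numeric bounds are assumptions rather than documented limits.

```typescript
// Hypothetical type mirroring the documented request body fields.
interface TtsRequestBody {
  text: string;
  speaking_rate: number;
  noise_scale: number;
  noise_scale_duration: number;
}

// Defaults copied from the field list above.
const TTS_DEFAULTS = {
  speaking_rate: 1.0,
  noise_scale: 0.667,
  noise_scale_duration: 0.8,
};

// Rough check for Arabic-script input using the basic Arabic Unicode block
// (U+0600 to U+06FF). It cannot tell Urdu apart from other languages that
// use the same script.
function hasArabicScript(text: string): boolean {
  return /[\u0600-\u06FF]/.test(text);
}

// Hypothetical helper: merges caller overrides onto the documented defaults.
function buildTtsBody(
  text: string,
  overrides: Partial<Omit<TtsRequestBody, "text">> = {},
): TtsRequestBody {
  if (text.trim().length === 0) {
    throw new Error("text is required");
  }
  const body: TtsRequestBody = { text, ...TTS_DEFAULTS, ...overrides };
  // Assumed bounds: a speed multiplier should be positive, and the noise
  // scales should not be negative (0.0 is documented as monotone).
  if (!(body.speaking_rate > 0)) {
    throw new RangeError("speaking_rate must be positive");
  }
  if (body.noise_scale < 0 || body.noise_scale_duration < 0) {
    throw new RangeError("noise scales must not be negative");
  }
  return body;
}
```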

Example Usage

typescript
import { SDK } from "@meetkai/mka1";

const sdk = new SDK({
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const result = await sdk.llm.speech.textToSpeech({
    text: "<value>",
  });

  console.log(result);
}

run();

Standalone function

The standalone function version of this method:

typescript
import { SDKCore } from "@meetkai/mka1/core.js";
import { llmSpeechTextToSpeech } from "@meetkai/mka1/funcs/llmSpeechTextToSpeech.js";

// Use `SDKCore` for best tree-shaking performance.
// You can create one instance of it to use across an application.
const sdk = new SDKCore({
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const res = await llmSpeechTextToSpeech(sdk, {
    text: "<value>",
  });
  if (res.ok) {
    const { value: result } = res;
    console.log(result);
  } else {
    console.log("llmSpeechTextToSpeech failed:", res.error);
  }
}

run();

React hooks and utilities

This method can be used in React components through the following hooks and associated utilities.

Check out this guide for information about each of the utilities below and how to get started using React hooks.

tsx
import {
  // Mutation hook for triggering the API call.
  useLlmSpeechTextToSpeechMutation
} from "@meetkai/mka1/react-query/llmSpeechTextToSpeech.js";

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| request | operations.TextToSpeechRequestBody | ✔️ | The request object to use for the request. |
| options | RequestOptions | | Used to set various options for making HTTP requests. |
| options.fetchOptions | RequestInit | | Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All Request options, except method and body, are allowed. |
| options.retries | RetryConfig | | Enables retrying HTTP requests under certain failure conditions. |

Response

Promise<ReadableStream<Uint8Array>>
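
Because the result is a byte stream rather than a finished file, callers usually need to drain it before saving or playing the audio. The sketch below shows one way to do that with the standard stream reader API. `streamToBytes`, `looksLikeWav`, and `fakeAudioStream` are hypothetical helpers, and the stream type is imported from `node:stream/web` on the assumption that the code runs on Node.js; in a browser the global `ReadableStream` would be used instead.

```typescript
import { ReadableStream } from "node:stream/web";

// Drains a byte stream into a single contiguous Uint8Array.
async function streamToBytes(
  stream: ReadableStream<Uint8Array>,
): Promise<Uint8Array> {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done || value === undefined) break;
    chunks.push(value);
    total += value.byteLength;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

// Compares bytes at `offset` against an ASCII tag.
function hasAscii(bytes: Uint8Array, offset: number, tag: string): boolean {
  for (let i = 0; i < tag.length; i++) {
    if (bytes[offset + i] !== tag.charCodeAt(i)) return false;
  }
  return true;
}

// A WAV file starts with "RIFF", a 4-byte size, then "WAVE".
function looksLikeWav(bytes: Uint8Array): boolean {
  return bytes.byteLength >= 12 && hasAscii(bytes, 0, "RIFF") && hasAscii(bytes, 8, "WAVE");
}

// Hypothetical stand-in for the SDK response: emits the given chunks in order.
function fakeAudioStream(chunks: Uint8Array[]): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}
```

Once drained, the bytes can be written to disk (for example with `writeFile` from `node:fs/promises`) after confirming that `looksLikeWav` returns true.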

Errors

| Error Type | Status Code | Content Type |
| --- | --- | --- |
| errors.APIError | 4XX, 5XX | */* |