Llm.Speech

Overview

Available Operations

transcribe - Speech to text transcription
speak - Text to speech
speakStreaming - Streaming text to speech

transcribe

Convert audio to text using advanced speech recognition.

Upload Methods:

Complete File Upload (Standard)
- Use Content-Type: multipart/form-data
- Upload the complete audio file in one request
- Maximum file size: 25MB
- Example (curl):
```bash
curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
  -F "file=@audio.flac"
```
Chunked Upload (Streaming)
- Use Transfer-Encoding: chunked header
- Stream audio data in chunks as it's being recorded
- No need to know total file size upfront
- Server buffers chunks until complete before processing
- Maximum total size: 25MB
- Example (curl):
```bash
curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
  -H "Transfer-Encoding: chunked" \
  -H "Content-Type: multipart/form-data" \
  --data-binary @audio.flac
```

Supported Formats: FLAC, MP3, MP4, MPEG, MPGA, M4A, OGG, WAV, WebM
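
The size limit and format list above can be checked on the client before any bytes are sent. The sketch below is one way to do that; `validateAudioFile` is a hypothetical helper, not part of the SDK. It assumes that "25MB" means 25 × 1024 × 1024 bytes and that the file extension reflects the actual format, neither of which the text above confirms. The server remains the authority on what it accepts.

```typescript
// Limits copied from the text above. Reading "25MB" as 25 * 1024 * 1024 bytes
// is an assumption; the server may count in decimal megabytes instead.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Lower-case extensions for the formats listed under "Supported Formats".
const SUPPORTED_EXTENSIONS = new Set([
  "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
]);

interface ValidationResult {
  ok: boolean;
  reason?: string;
}

// Hypothetical helper: checks the extension and size before a request is made.
function validateAudioFile(fileName: string, sizeBytes: number): ValidationResult {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `unsupported format: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: `file exceeds ${MAX_UPLOAD_BYTES} bytes` };
  }
  return { ok: true };
}
```

Running this check first avoids uploading a file that would likely be rejected for its size or format.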

Query Parameters:

language (optional): ISO-639-1 language code (e.g., "en", "es", "fr"). Auto-detects if not specified.
prompt (optional): Text to guide transcription style
temperature (optional): Sampling temperature 0-1 (higher = more random)
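
To show how these parameters end up on the request, here is a small sketch that builds the transcription URL. The path is taken from the curl examples above; `buildTranscriptionUrl` and `TranscriptionQuery` are hypothetical names, and treating the temperature bounds as inclusive is an assumption.

```typescript
// Hypothetical shape mirroring the three query parameters listed above.
interface TranscriptionQuery {
  language?: string;    // ISO-639-1 code such as "en"; omitted means auto-detect
  prompt?: string;      // text to guide transcription style
  temperature?: number; // 0 to 1, assumed inclusive
}

// Builds the request URL; only parameters that are set are added to the query.
function buildTranscriptionUrl(baseUrl: string, query: TranscriptionQuery = {}): string {
  const url = new URL("/api/v1/llm/speech/transcriptions", baseUrl);
  if (query.language !== undefined) {
    url.searchParams.set("language", query.language);
  }
  if (query.prompt !== undefined) {
    url.searchParams.set("prompt", query.prompt);
  }
  if (query.temperature !== undefined) {
    if (!(query.temperature >= 0 && query.temperature <= 1)) {
      throw new RangeError("temperature must be between 0 and 1");
    }
    url.searchParams.set("temperature", String(query.temperature));
  }
  return url.toString();
}
```

The resulting string can be passed to any HTTP client together with the multipart body. `URLSearchParams` handles the escaping of the prompt text.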

Response: Returns transcribed text in JSON format.

Example Usage

```typescript
import { SDK } from "@meetkai/mka1";
import { openAsBlob } from "node:fs";

const sdk = new SDK({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const result = await sdk.llm.speech.transcribe({
    requestBody: {
      file: await openAsBlob("example.file"),
    },
  });

  console.log(result);
}

run();
```

Standalone function

The standalone function version of this method:

```typescript
import { SDKCore } from "@meetkai/mka1/core.js";
import { llmSpeechTranscribe } from "@meetkai/mka1/funcs/llmSpeechTranscribe.js";
import { openAsBlob } from "node:fs";

// Use `SDKCore` for best tree-shaking performance.
// You can create one instance of it to use across an application.
const sdk = new SDKCore({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const res = await llmSpeechTranscribe(sdk, {
    requestBody: {
      file: await openAsBlob("example.file"),
    },
  });
  if (res.ok) {
    const { value: result } = res;
    console.log(result);
  } else {
    console.log("llmSpeechTranscribe failed:", res.error);
  }
}

run();
```

React hooks and utilities

This method can be used in React components through the following hooks and associated utilities.

Check out this guide for information about each of the utilities below and how to get started using React hooks.

```tsx
import {
  // Mutation hook for triggering the API call.
  useLlmSpeechTranscribeMutation
} from "@meetkai/mka1/react-query/llmSpeechTranscribe.js";
```

Parameters

Parameter	Type	Required	Description
`request`	operations.TranscribeRequest	✔️	The request object to use for the request.
`options`	RequestOptions	➖	Used to set various options for making HTTP requests.
`options.fetchOptions`	RequestInit	➖	Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All `Request` options, except `method` and `body`, are allowed.
`options.retries`	RetryConfig	➖	Enables retrying HTTP requests under certain failure conditions.

Response

Promise<components.TranscriptionResponse>

Errors

Error Type	Status Code	Content Type
errors.APIError	4XX, 5XX	*/*

speak

Convert text to speech with automatic language detection.

Request Body:

text: Input text to convert to speech - required
language: Language code (default: "auto") - "auto" for automatic detection, or ISO 639-1 codes: en, zh, hi, es, ar, bn, pt, ru, ja, pa, de, ko, fr, tr, it, th, pl, nl, id, vi, ur
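
A request body matching the two fields above can be assembled and checked locally. The following is a minimal sketch; `buildSpeakRequest` is a hypothetical helper, and rejecting blank text on the client is an assumption about how "required" is enforced.

```typescript
// Language values listed above: "auto" plus the ISO 639-1 codes.
const SPEAK_LANGUAGES = [
  "auto", "en", "zh", "hi", "es", "ar", "bn", "pt", "ru", "ja", "pa",
  "de", "ko", "fr", "tr", "it", "th", "pl", "nl", "id", "vi", "ur",
] as const;

type SpeakLanguage = (typeof SPEAK_LANGUAGES)[number];

interface SpeakRequest {
  text: string;
  language: SpeakLanguage;
}

// Hypothetical helper: applies the documented default and rejects unknown codes.
function buildSpeakRequest(text: string, language: string = "auto"): SpeakRequest {
  if (text.trim().length === 0) {
    throw new Error("text is required");
  }
  if ((SPEAK_LANGUAGES as readonly string[]).indexOf(language) === -1) {
    throw new RangeError(`unsupported language: ${language}`);
  }
  return { text, language: language as SpeakLanguage };
}
```

Calling `buildSpeakRequest("Hello")` yields a body with `language` set to `"auto"`, which matches the default described above.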

Response: Returns an audio file in WAV format, with an X-Language-Code header.
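
As an illustration of handling this response, the sketch below reads the X-Language-Code header and checks that the body starts with the standard RIFF/WAVE signature. `describeSpeechResponse` is a hypothetical helper. It takes any object with a `get` method (such as a fetch `Headers` instance) plus the raw bytes, because the SDK's exact response type is not described here.

```typescript
interface SpeechResponseInfo {
  languageCode: string | null; // value of X-Language-Code, if present
  looksLikeWav: boolean;       // true when the RIFF/WAVE signature is found
}

// Reads `length` bytes starting at `offset` as ASCII text.
function asciiAt(bytes: Uint8Array, offset: number, length: number): string {
  let out = "";
  for (let i = offset; i < offset + length && i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Hypothetical helper: summarises a speak response from its headers and body.
function describeSpeechResponse(
  headers: { get(name: string): string | null },
  body: Uint8Array,
): SpeechResponseInfo {
  return {
    languageCode: headers.get("X-Language-Code"),
    // A WAV file begins with "RIFF", a 4-byte size, then "WAVE".
    looksLikeWav: asciiAt(body, 0, 4) === "RIFF" && asciiAt(body, 8, 4) === "WAVE",
  };
}
```

The signature check is only a sanity test on the first twelve bytes. It does not validate the rest of the file.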

Example Usage

```typescript
import { SDK } from "@meetkai/mka1";

const sdk = new SDK({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const result = await sdk.llm.speech.speak({
    text: "<value>",
  });

  console.log(result);
}

run();
```

Standalone function

The standalone function version of this method:

```typescript
import { SDKCore } from "@meetkai/mka1/core.js";
import { llmSpeechSpeak } from "@meetkai/mka1/funcs/llmSpeechSpeak.js";

// Use `SDKCore` for best tree-shaking performance.
// You can create one instance of it to use across an application.
const sdk = new SDKCore({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const res = await llmSpeechSpeak(sdk, {
    text: "<value>",
  });
  if (res.ok) {
    const { value: result } = res;
    console.log(result);
  } else {
    console.log("llmSpeechSpeak failed:", res.error);
  }
}

run();
```

React hooks and utilities

This method can be used in React components through the following hooks and associated utilities.

Check out this guide for information about each of the utilities below and how to get started using React hooks.

```tsx
import {
  // Mutation hook for triggering the API call.
  useLlmSpeechSpeakMutation
} from "@meetkai/mka1/react-query/llmSpeechSpeak.js";
```

Parameters

Parameter	Type	Required	Description
`request`	components.TextToSpeechRequest	✔️	The request object to use for the request.
`options`	RequestOptions	➖	Used to set various options for making HTTP requests.
`options.fetchOptions`	RequestInit	➖	Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All `Request` options, except `method` and `body`, are allowed.
`options.retries`	RetryConfig	➖	Enables retrying HTTP requests under certain failure conditions.

Response

Promise<operations.TextToSpeechResponse>

Errors

Error Type	Status Code	Content Type
errors.APIError	4XX, 5XX	*/*

speakStreaming

Convert text to speech with real-time streaming audio delivery.

Key Features:

Low-latency audio streaming - playback can start immediately as chunks arrive
Automatic language detection
Multiple format support: MP3 or PCM/WAV
High-quality audio: 24kHz sample rate, 16-bit mono

Request Body:

text: Input text to convert to speech - required
language: Language code (default: "auto") - "auto" for automatic detection, or ISO 639-1 codes: en, zh, hi, es, ar, bn, pt, ru, ja, pa, de, ko, fr, tr, it, th, pl, nl, id, vi, ur
format: Audio format (default: "mp3") - "mp3" for compressed MPEG audio (96 kbps) or "pcm" for uncompressed WAV
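
The figures above (24 kHz, 16-bit mono for PCM; 96 kbps for MP3) imply a data rate for each format, which is useful when sizing buffers or estimating how much audio has arrived. The sketch below works through that arithmetic. The function names are hypothetical. It assumes the MP3 stream is constant bitrate and ignores any WAV header bytes, so the results are estimates.

```typescript
type StreamingFormat = "mp3" | "pcm";

// Bytes of audio data per second of playback for each format.
function bytesPerSecond(format: StreamingFormat): number {
  if (format === "pcm") {
    const sampleRate = 24000;       // 24 kHz
    const bytesPerSample = 16 / 8;  // 16-bit samples
    const channels = 1;             // mono
    return sampleRate * bytesPerSample * channels;
  }
  // 96 kbps, assumed constant bitrate, converted from bits to bytes.
  return 96000 / 8;
}

// Rough playback duration represented by `byteCount` bytes of audio data.
function estimateDurationSeconds(byteCount: number, format: StreamingFormat = "mp3"): number {
  return byteCount / bytesPerSecond(format);
}
```

Under these assumptions PCM carries 48,000 bytes per second and MP3 carries 12,000, so the same text costs about four times more bandwidth as PCM.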

Response:

Streams audio chunks in real-time
Returns X-Language-Code header with detected/used language
Content-Type: audio/mpeg (MP3) or audio/wav (PCM)
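
Because chunks arrive while synthesis is still running, a consumer typically handles each chunk as soon as it is received. The sketch below shows that pattern. `consumeAudioChunks` is a hypothetical helper, and it assumes the streamed body can be iterated as an `AsyncIterable<Uint8Array>`; the type the SDK actually returns is not described here.

```typescript
// Hypothetical helper: passes each chunk to `onChunk` as it arrives and
// returns the concatenated audio once the stream ends.
async function consumeAudioChunks(
  chunks: AsyncIterable<Uint8Array>,
  onChunk: (chunk: Uint8Array, totalBytes: number) => void,
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of chunks) {
    parts.push(chunk);
    total += chunk.byteLength;
    onChunk(chunk, total); // e.g. feed an audio player or update a progress bar
  }
  // Join all chunks into one buffer for saving or replay.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}
```

For playback-only use, the final concatenation can be dropped so that memory use stays flat while streaming.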

Use Cases:

Real-time applications requiring immediate audio playback
Interactive voice responses
Low-latency text-to-speech scenarios

Example Usage

```typescript
import { SDK } from "@meetkai/mka1";

const sdk = new SDK({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const result = await sdk.llm.speech.speakStreaming({
    text: "<value>",
  });

  console.log(result);
}

run();
```

Standalone function

The standalone function version of this method:

```typescript
import { SDKCore } from "@meetkai/mka1/core.js";
import { llmSpeechSpeakStreaming } from "@meetkai/mka1/funcs/llmSpeechSpeakStreaming.js";

// Use `SDKCore` for best tree-shaking performance.
// You can create one instance of it to use across an application.
const sdk = new SDKCore({
  serverURL: "https://api.example.com",
  bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
});

async function run() {
  const res = await llmSpeechSpeakStreaming(sdk, {
    text: "<value>",
  });
  if (res.ok) {
    const { value: result } = res;
    console.log(result);
  } else {
    console.log("llmSpeechSpeakStreaming failed:", res.error);
  }
}

run();
```

React hooks and utilities

This method can be used in React components through the following hooks and associated utilities.

Check out this guide for information about each of the utilities below and how to get started using React hooks.

```tsx
import {
  // Mutation hook for triggering the API call.
  useLlmSpeechSpeakStreamingMutation
} from "@meetkai/mka1/react-query/llmSpeechSpeakStreaming.js";
```

Parameters

Parameter	Type	Required	Description
`request`	components.TextToSpeechStreamingRequest	✔️	The request object to use for the request.
`options`	RequestOptions	➖	Used to set various options for making HTTP requests.
`options.fetchOptions`	RequestInit	➖	Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All `Request` options, except `method` and `body`, are allowed.
`options.retries`	RetryConfig	➖	Enables retrying HTTP requests under certain failure conditions.

Response

Promise<operations.TextToSpeechStreamingResponse>

Errors

Error Type	Status Code	Content Type
errors.APIError	4XX, 5XX	*/*
