Speech Recognition (ASR)

Transcribe spoken audio to text using transcribe() or atranscribe().

Basic usage

from khaya import KhayaClient

api_key = "YOUR_API_KEY"  # placeholder: substitute your own key

with KhayaClient(api_key) as khaya:
    result = khaya.transcribe("recording.wav", "tw")
    print(result.text)  # "me ho yɛ"

The second argument is the code of the language spoken in the audio (see Supported languages).

transcribe() returns a TranscriptionResult with:

Attribute	Type	Description
`text`	`str`	The transcribed string
`language`	`str`	Language code of the audio (e.g. `"tw"`)
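Because the result exposes only these two attributes, code that consumes transcripts can be exercised without calling the API. The sketch below is one way to do that: FakeTranscriptionResult and summarize() are hypothetical names introduced for illustration and are not part of the khaya package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTranscriptionResult:
    """Hypothetical test double mirroring the two documented attributes."""
    text: str
    language: str


def summarize(result) -> str:
    """Format any object exposing .text and .language, e.g. for logging."""
    return f"[{result.language}] {result.text}"
```

A real TranscriptionResult should work with summarize() as well, since the function relies only on the text and language attributes listed above.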

Supported languages

Code	Language
`ada`	Adangme
`en_gh`	African English
`atw`	Akuapem Twi
`tw`	Asante Twi
`dga`	Dagaare
`dag`	Dagbani
`ee`	Ewe
`fat`	Fante
`fra`	French
`gaa`	Ga
`gon`	Gonja
`gur`	Gurene
`ha`	Hausa
`ig`	Igbo
`kas`	Kasem
`ki`	Kikuyu
`kon_k`	Konkomba (Likoonli)
`kon_l`	Konkomba (Likpakpaanl)
`kri`	Krio
`kus`	Kusaal
`luo`	Luo
`mam`	Mampruli
`men`	Mende
`mer`	Meru/Kimeru
`nzi`	Nzema
`pid`	Pidgin
`sn`	Shona
`sw`	Swahili
`tem`	Temne
`wal`	Wali
`wo`	Wolof
`yo`	Yoruba
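A mistyped code can be caught locally before any audio is uploaded. The following is a minimal sketch, assuming the table above is complete and that codes are lowercase as shown; SUPPORTED_ASR_LANGUAGES and validate_language() are illustrative names, not part of the khaya package.

```python
# Codes copied from the table above. The helper is illustrative only;
# the khaya package may perform its own validation.
SUPPORTED_ASR_LANGUAGES = frozenset({
    "ada", "en_gh", "atw", "tw", "dga", "dag", "ee", "fat", "fra", "gaa",
    "gon", "gur", "ha", "ig", "kas", "ki", "kon_k", "kon_l", "kri", "kus",
    "luo", "mam", "men", "mer", "nzi", "pid", "sn", "sw", "tem", "wal",
    "wo", "yo",
})


def validate_language(code: str) -> str:
    """Return the normalized code, or raise ValueError if it is not in the table."""
    # Assumption: codes are lowercase, as they appear in the table.
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_ASR_LANGUAGES:
        raise ValueError(f"Unsupported ASR language code: {code!r}")
    return normalized
```

The returned value can then be passed as the second argument to transcribe().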

Audio requirements

Format: WAV (.wav)
Encoding: PCM (uncompressed)
Sample rate: 16 kHz recommended
Channels: Mono

Convert to the correct format with ffmpeg if needed:

ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
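To decide whether a file needs converting at all, the requirements above can be checked with Python's standard wave module. This is a sketch rather than part of the khaya package: check_wav() is a hypothetical helper, and it treats the recommended 16 kHz rate as a strict expectation, which may be stricter than the service actually requires.

```python
import wave


def check_wav(path: str, expected_rate: int = 16000) -> list[str]:
    """Return a list of problems found; an empty list means the file looks fine."""
    try:
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()
            rate = wav.getframerate()
    except (wave.Error, EOFError) as exc:
        # The stdlib wave module rejects non-WAV data and compressed WAV files.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    if rate != expected_rate:
        problems.append(f"expected {expected_rate} Hz, found {rate} Hz")
    return problems
```

If the returned list is not empty, running the ffmpeg command above on the file should produce a mono 16 kHz WAV.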

Saving the transcript

with KhayaClient(api_key) as khaya:
    result = khaya.transcribe("speech.wav", "tw")
    # Write as UTF-8 so characters such as "ɛ" are saved correctly on every platform
    with open("transcript.txt", "w", encoding="utf-8") as f:
        f.write(result.text)
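When several recordings need transcripts, the same pattern can be wrapped in a small helper. The sketch below assumes only what this page documents, namely a client with transcribe(path, language) returning an object with a text attribute; save_transcript() and the transcripts output directory are hypothetical names chosen for illustration.

```python
from pathlib import Path


def save_transcript(client, audio_path, language, out_dir="transcripts"):
    """Transcribe one file and write the text to <out_dir>/<audio name>.txt."""
    result = client.transcribe(str(audio_path), language)
    out_path = Path(out_dir) / (Path(audio_path).stem + ".txt")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Explicit UTF-8 so characters such as "ɛ" are written correctly everywhere.
    out_path.write_text(result.text, encoding="utf-8")
    return out_path
```

Inside a with KhayaClient(api_key) as khaya: block, calling save_transcript(khaya, "speech.wav", "tw") would write transcripts/speech.txt. Passing the client in as an argument keeps the helper easy to test with a stand-in object.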

Error handling

from khaya import KhayaClient
from khaya.exceptions import ASRTranscriptionError, AuthenticationError, APIError

with KhayaClient(api_key) as khaya:
    try:
        result = khaya.transcribe("speech.wav", "tw")
    except ASRTranscriptionError as e:
        # Raised when the file is not found or the input is invalid
        print(f"Transcription error: {e.message}")
    except AuthenticationError:
        print("Check your API key.")
    except APIError as e:
        print(f"API error {e.status_code}: {e.message}")

See Error Handling for the full exception reference.
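Some failures may be worth retrying while others are not. The sketch below shows one possible retry wrapper; it is not part of the khaya package. transcribe_with_retry() is a hypothetical helper, and which exceptions count as transient is left to the caller through retry_on. Treating APIError as retryable is an assumption, and if AuthenticationError happens to subclass APIError it would be retried too, so check the exception reference before choosing.

```python
import time


def transcribe_with_retry(client, audio_path, language, retry_on,
                          attempts=3, delay=1.0):
    """Call client.transcribe(), retrying only on the given exception types."""
    for attempt in range(1, attempts + 1):
        try:
            return client.transcribe(audio_path, language)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay * attempt)  # linear backoff between attempts
```

For example, transcribe_with_retry(khaya, "speech.wav", "tw", retry_on=(APIError,)) would retry up to three times on APIError and let every other exception propagate immediately.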