Speech AI Platform
Pronunciation scoring, STT & TTS in 150MB for AI agents
Speech AI Platform – Compact, integrated pronunciation scoring, STT, and TTS
Summary: Speech AI Platform offers three APIs (pronunciation assessment, speech-to-text, and text-to-speech) in a single system. Speech understanding runs on a shared 17MB model, and synthesis uses the Kokoro-82M TTS model. AI agents can reach it through an MCP server and a REST API, so all three functions need only one integration.
What it does
The platform provides phoneme-level pronunciation scoring, word-level transcription with timestamps and confidence, and 12 English TTS voices using Kokoro-82M. All features run on lightweight models and are accessible with a single API key or through an MCP server that exposes eight tools.
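As a rough illustration of how a client might consume word-level transcription with timestamps and confidence, here is a minimal Python sketch. The payload shape and the field names (`words`, `text`, `start`, `end`, `confidence`), the 0-to-1 confidence range, and the 0.6 threshold are all assumptions made for this example, not the platform's documented response format; the sample values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float       # seconds from the beginning of the audio
    end: float         # seconds from the beginning of the audio
    confidence: float  # assumed to lie in [0.0, 1.0]


def parse_words(payload: dict) -> List[Word]:
    """Convert a hypothetical JSON transcription payload into Word records."""
    return [
        Word(w["text"], float(w["start"]), float(w["end"]), float(w["confidence"]))
        for w in payload.get("words", [])
    ]


def low_confidence(words: List[Word], threshold: float = 0.6) -> List[Word]:
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


# Made-up sample in the assumed shape; not real output from the platform.
sample = {
    "words": [
        {"text": "the", "start": 0.00, "end": 0.18, "confidence": 0.97},
        {"text": "thorough", "start": 0.18, "end": 0.74, "confidence": 0.41},
        {"text": "review", "start": 0.74, "end": 1.20, "confidence": 0.88},
    ]
}

words = parse_words(sample)
flagged = low_confidence(words)
print([w.text for w in flagged])  # prints ['thorough']
```

A language learning app could use a filter like this to decide which words to replay for the learner or to send on for phoneme-level scoring.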
Who it's for
It targets language learning apps and AI tutoring agents requiring combined pronunciation scoring, transcription, and speech synthesis from one provider.
Why it matters
It puts consistent, low-latency speech processing behind one unified API, so AI agents and language applications integrate once instead of combining separate scoring, transcription, and synthesis tools.