Voice API
Text-to-Speech · Speech-to-Text
Native-level speech synthesis and real-time streaming speech recognition.
The core technology powering AI Agents, chatbots, education, conversation, and more.
Why Core Technology?
Voice API is not just voice conversion. It is the foundational infrastructure that determines the user experience of AI services.
Service Infrastructure
Every service that needs voice runs on this API: AI agents, chatbots, conversation training, educational content, and browser extensions. API quality equals service quality.
Real-time Streaming Required
In one-on-one AI conversation, dialogue feels unnatural once response latency exceeds 1 second. We target under 500 ms latency with WebSocket-based streaming.
Internal + External API
Beyond internal service integration, this is an independent technology asset that can be monetized by providing APIs to external clients.
Service Architecture
talking.how: AI Conversation
native.how: TTS B2C
AI Agent: Voice Agent
loa.bot etc.: Chatbot / Education
↓ API Call
native.how / API: REST API + WebSocket Streaming
↓ Wrapping
Google Cloud TTS / STT API: Seoul Region (Minimal Latency)
TTS Technical Specs
Neural TTS
Based on WaveNet / Neural2. Naturally reproduces human intonation, emotion, and rhythm.
TTS Streaming
Real-time chunk-based delivery. Even long texts start playing immediately, minimizing wait time.
100+ Languages
100+ languages, including Korean, English, Japanese, and Chinese. Multiple voice styles per language.
Voice Customization
Speed, pitch, volume control. SSML support for emphasis, pauses, and precise pronunciation control.
Multiple Input Formats
Automatic parsing of text, PDF, and webpage URLs. SSML markup also supported.
Multiple Output Formats
MP3, WAV, OGG, FLAC, and more. Configurable bitrate and sample rate.
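As an illustration of the SSML controls listed above (speed, pitch, emphasis, pauses), here is a minimal Python sketch that wraps plain text in standard SSML elements. The helper name build_ssml and its parameters are hypothetical and not part of the Voice API; only the SSML tags themselves (speak, prosody, emphasis, break) come from the SSML standard.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", pitch: str = "+0st",
               pause_ms: int = 0, emphasize: str = "") -> str:
    """Wrap plain text in SSML with prosody, optional emphasis, and a trailing pause.

    build_ssml and its parameters are hypothetical helpers for illustration.
    """
    # Escape &, <, > so user text cannot break the markup.
    body = escape(text)
    if emphasize:
        # Emphasize only the first occurrence of the given phrase.
        target = escape(emphasize)
        body = body.replace(
            target, f'<emphasis level="strong">{target}</emphasis>', 1
        )
    if pause_ms > 0:
        # Insert a pause after the spoken text.
        body += f'<break time="{pause_ms}ms"/>'
    return (
        f"<speak><prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{body}</prosody></speak>"
    )
```

Calling build_ssml("Welcome back", rate="slow", pause_ms=300) would produce a string that could be sent as SSML input, assuming the endpoint accepts SSML in the same field as plain text.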
STT Technical Specs
Real-time STT Streaming
WebSocket-based real-time speech recognition. Text appears instantly as you speak.
Interim Results
Real-time delivery of intermediate recognition results. Response preparation can begin before the user finishes speaking.
AI Post-processing
Automatic punctuation, word correction, and speaker diarization support.
VAD (Voice Activity Detection)
Automatic voice segment detection. Maximizes efficiency by reducing unnecessary processing during silent periods.
Confidence Score
Confidence score provided for each recognition result. Enables re-confirmation logic for low-confidence segments.
Context Hints
Pre-specify domain terminology and proper nouns to improve recognition accuracy.
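The re-confirmation logic mentioned under Confidence Score could look roughly like the sketch below. The Segment shape, the 0.0 to 1.0 confidence range, and the 0.8 threshold are all assumptions made for illustration; the actual response format of the STT endpoints is not described here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape of one final recognition result.
    text: str
    confidence: float  # assumed range: 0.0 to 1.0


def segments_to_reconfirm(segments: list[Segment],
                          threshold: float = 0.8) -> list[Segment]:
    """Return the segments whose confidence falls below the threshold."""
    return [s for s in segments if s.confidence < threshold]


def confirmation_prompt(segment: Segment) -> str:
    """Build a question asking the user to confirm an uncertain segment."""
    return f'Did you say "{segment.text}"?'
```

A voice agent could speak the confirmation prompt through TTS before acting on an uncertain segment, and the threshold would normally be tuned per domain.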
Real-time Voice Pipeline
The core of real-time voice interaction for AI conversation, voice agents, and more. Targeting total pipeline latency < 1 second.
User Voice (Mic Input) → STT Stream (Real-time Recognition) → LLM Processing (Response Generation) → TTS Stream (Speech Synthesis) → Speaker Output (AI Response)
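The stages above can be sketched as chained async generators, so that speech synthesis starts on the first words of the reply instead of waiting for the full response. Every stage here is a stub with a hypothetical name: no real STT, LLM, or TTS service is called, and the byte strings stand in for audio.

```python
import asyncio
from typing import AsyncIterator


async def stt_stream(audio_chunks: list[bytes]) -> AsyncIterator[str]:
    """Stub STT stage: pretends each audio chunk decodes to one word."""
    for chunk in audio_chunks:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk.decode("utf-8")


async def llm_reply(transcript: str) -> AsyncIterator[str]:
    """Stub LLM stage: streams a reply one word at a time."""
    for word in f"You said: {transcript}".split():
        await asyncio.sleep(0)
        yield word


async def tts_stream(words: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stub TTS stage: emits one fake audio chunk per word as it arrives."""
    async for word in words:
        yield word.encode("utf-8")


async def run_pipeline(audio_chunks: list[bytes]) -> list[bytes]:
    """Run mic audio through STT, LLM, and TTS and collect the output chunks."""
    # Gather the full transcript first, then stream the reply into TTS.
    transcript = " ".join([w async for w in stt_stream(audio_chunks)])
    return [chunk async for chunk in tts_stream(llm_reply(transcript))]
```

In this sketch the LLM and TTS stages overlap because tts_stream consumes words as llm_reply yields them. A production pipeline would also feed interim STT results forward, which this stub leaves out.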
REST API + WebSocket
/api/v1/tts
/api/v1/tts/stream
/api/v1/stt
/api/v1/stt/stream
/api/v1/voices
/api/v1/languages
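A call to the /api/v1/tts endpoint might be prepared as in the sketch below. Only the path comes from the list above. The host api.example.com is a placeholder, and the JSON field names (text, language) and the bearer-token header are assumptions, since the request schema and authentication scheme are not specified here.

```python
import json
import urllib.request

# Placeholder host; the real base URL is not stated in this document.
BASE_URL = "https://api.example.com"


def build_tts_request(text: str, language: str,
                      api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /api/v1/tts.

    The JSON field names and the bearer-token header are assumptions.
    """
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/tts",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. The streaming endpoints would need a WebSocket client instead, which is outside the standard library.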
Core Technology Applied Where It Matters
Voice API is not limited to specific services. It is universally applied wherever voice capabilities are needed across various services.
native.how
TTS + STT API / B2C
A B2C service that reads text, PDFs, and webpages naturally like a native speaker, as well as the voice API infrastructure consumed by all services.
Visit →
talking.how
Real-time Streaming Conversation
Full pipeline from STT streaming → LLM → TTS streaming. Voice API streaming performance determines conversation quality.
Visit →
AI Agent
Voice-based agent interaction. Building AI agents that communicate with users through voice.
loa.bot
Messenger bot voice messages. Convert text responses to voice using TTS.
Education Content
Convert learning materials to voice. Auto-generate native speaker audio for textbooks, vocabulary, and exercises.
Browser Extension
Web page TTS reading. Read translated text with native pronunciation.
AI Patent
Patent document voice review. Listen to long specifications as audio while reviewing.
Custom Development
Custom voice integration for clients. Connect TTS/STT to client services via API.
Want to learn more about Voice API?
We provide consultation on API integration, custom development, and technology partnerships.
Contact Us