Auto vs Realtime Voice

Compare the two voice modes and choose the right one for your use case.

Comparison

| Feature | Auto Mode | Realtime Mode |
|---|---|---|
| Interaction | Turn-based (speak → wait → response) | Full-duplex, natural conversation |
| Latency | Moderate (processes in stages) | Low (instant speech-to-speech) |
| AI Models | Any LLM (GPT-4o, Gemini, Claude, etc.) | Gemini Live or xAI Grok only |
| Voice Quality | Very good (dedicated TTS) | Excellent (native audio) |
| Tool Support | Full (web search, KB, etc.) | Limited (provider-specific) |
| Plan Required | Free+ (5 min/day) | Plus+ (15 min/day) |

When to Use Auto

  • Customer support — Structured Q&A with knowledge base
  • Any LLM — Works with GPT-4o, Claude, Gemini, etc.
  • Tool usage — Needs web search, image generation, etc.

When to Use Realtime

  • Natural conversations — Back-and-forth dialogue
  • Low latency — Instant responses feel more natural
  • Simple queries — Don't need complex tool integrations
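
The guidance in the two lists above can be read as a small decision rule. The sketch below is illustrative only: `VoiceRequirements`, `choose_voice_mode`, the provider identifiers, and the plan names as lowercase strings are all hypothetical and are not part of any documented API. It also simplifies "limited, provider-specific" tool support in Realtime to "use Auto whenever full tool support is needed".

```python
from dataclasses import dataclass

# Hypothetical identifiers for the two providers that Realtime mode supports.
REALTIME_PROVIDERS = {"gemini-live", "xai-grok"}

# Realtime requires Plus or higher according to the comparison table.
REALTIME_PLANS = {"plus", "pro", "enterprise"}


@dataclass
class VoiceRequirements:
    plan: str = "free"                # "free", "plus", "pro", or "enterprise"
    provider: str = "gemini-live"     # hypothetical provider identifier
    needs_full_tools: bool = False    # web search, knowledge base, image generation
    wants_full_duplex: bool = False   # natural back-and-forth conversation


def choose_voice_mode(req: VoiceRequirements) -> str:
    """Return "auto" or "realtime" based on the comparison above."""
    if req.plan.lower() not in REALTIME_PLANS:
        return "auto"      # Realtime is not included in the Free plan
    if req.provider not in REALTIME_PROVIDERS:
        return "auto"      # only Auto works with any LLM
    if req.needs_full_tools:
        return "auto"      # Realtime tool support is limited
    if req.wants_full_duplex:
        return "realtime"  # low-latency, full-duplex conversation
    return "auto"          # default to the mode with the widest support
```

For example, a Pro-plan assistant on Gemini Live that only needs conversational replies would resolve to "realtime", while the same assistant with knowledge-base lookups enabled would fall back to "auto".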

Voice Quota

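| Plan | Auto Voice | Realtime Voice |
|---|---|---|
| Free | 5 min/day | Not available (requires Plus or higher) |
| Plus | 30 min/day | 15 min/day |
| Pro | 120 min/day | 60 min/day |
| Enterprise | Unlimited | Unlimited |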
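
As a rough illustration of how these quotas might be tracked client-side, the sketch below encodes the table as a lookup and computes the minutes left for the day. The names `DAILY_QUOTA_MINUTES` and `remaining_minutes` are hypothetical, and two assumptions are made: the Free plan lists no Realtime allowance, so it is treated as 0 minutes (consistent with Realtime requiring Plus or higher), and "Unlimited" is modeled as infinity.

```python
import math

# Daily voice quota in minutes, taken from the table above.
# Assumption: Free has no Realtime allowance, so it is recorded as 0.
DAILY_QUOTA_MINUTES = {
    "free": {"auto": 5, "realtime": 0},
    "plus": {"auto": 30, "realtime": 15},
    "pro": {"auto": 120, "realtime": 60},
    "enterprise": {"auto": math.inf, "realtime": math.inf},
}


def remaining_minutes(plan: str, mode: str, used_minutes: float) -> float:
    """Return the minutes left today for a plan and voice mode, never below 0."""
    quota = DAILY_QUOTA_MINUTES[plan.lower()][mode]
    return max(quota - used_minutes, 0)
```

With this lookup, a Plus user who has spent 12 minutes in Auto mode has 18 minutes left, and an Enterprise user always has an unbounded remainder.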
