
Barge-In Configuration

Overview

Barge-In enables users to naturally interrupt the Voice Agent while it is speaking, creating a more human-like conversational experience. Instead of requiring users to wait for the assistant to finish its response, the agent can detect when the user starts speaking and determine whether the speech is intended as an interruption.

The Barge-In configuration controls how interruptions are detected, validated, and handled. These settings help balance responsiveness with accuracy by reducing false interruptions caused by background noise, microphone echo, brief acknowledgements, or accidental speech.

The Voice Agent supports two interruption modes:

  • Hard Barge-In – Immediately stops assistant playback when a valid interruption is detected or when predefined interruption phrases are recognized.
  • Soft Barge-In – Temporarily pauses assistant playback while the system evaluates whether the user genuinely intends to interrupt. Playback resumes automatically if the interruption is not confirmed.

The configuration allows administrators to control:

  • How quickly user speech is recognized as an interruption.
  • How much speech evidence is required before an interruption is confirmed.
  • How long the system waits before resuming playback during soft interruptions.
  • Protection mechanisms that prevent interruptions caused by echo or assistant audio leakage.
  • Recognition of common interruption phrases such as "stop" or "wait".
  • Filtering of acknowledgement phrases such as "yeah" or "okay" that should not interrupt the conversation.

Proper tuning of these settings helps ensure that the Voice Agent remains responsive to genuine user interruptions while minimizing unintended interruptions that can negatively impact the conversation experience.


Assistant Protection

These settings help prevent accidental interruptions caused by audio leakage, echo, or double-talk.

| Option | Value | Description |
| --- | --- | --- |
| Assistant Playback Grace Period | 1000 ms | Time immediately after the assistant starts speaking during which brief echo or audio leakage from the assistant is ignored to prevent accidental interruption detection. |
| Assistant Post Interrupt Backoff | 0 ms | Additional delay before assistant speech resumes after a confirmed interruption. Useful for reducing double-talk. A value of 0 disables the delay. |
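
The two protection settings can be pictured as simple time-window checks. The sketch below is illustrative only: the class and function names (`AssistantProtection`, `in_grace_period`, `earliest_resume_ms`) are hypothetical and not part of the product, and it assumes all timestamps are millisecond integers on a shared clock.

```python
from dataclasses import dataclass


@dataclass
class AssistantProtection:
    # Defaults mirror the documented values; field names are hypothetical.
    playback_grace_period_ms: int = 1000
    post_interrupt_backoff_ms: int = 0


def in_grace_period(cfg: AssistantProtection, playback_started_ms: int, now_ms: int) -> bool:
    """True while audio detected now should be ignored as likely echo or leakage."""
    elapsed = now_ms - playback_started_ms
    return 0 <= elapsed < cfg.playback_grace_period_ms


def earliest_resume_ms(cfg: AssistantProtection, interruption_confirmed_ms: int) -> int:
    """Earliest time the assistant may speak again after a confirmed interruption."""
    # A backoff of 0 means no extra delay is added.
    return interruption_confirmed_ms + cfg.post_interrupt_backoff_ms
```

With the default values, speech energy detected 500 ms after playback starts would be ignored, while the same energy at 1000 ms or later would be evaluated normally.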

Speech Detection

These settings determine how user speech is detected and classified as a potential interruption.

| Option | Value | Description |
| --- | --- | --- |
| Server Energy VAD Debounce | 100 ms | Minimum interval between server-side Voice Activity Detection (VAD) interrupt evaluations to reduce processing noise and event flooding. |
| Barge-In Speech Start Duration | 0.45 seconds | Minimum continuous speech duration required before user speech is considered a valid interruption of the assistant. |
| Barge-In Speech Stop Duration | 0.2 seconds | Minimum duration of silence required before the interruption is considered complete. |
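
To illustrate how the start and stop durations interact, here is a minimal sketch of a duration-gated detector. It is not the Voice Agent's implementation: the `BargeInDetector` class is hypothetical, it assumes the VAD delivers a speech/non-speech flag per frame with a millisecond timestamp, and it expresses the documented 0.45 s and 0.2 s thresholds as 450 ms and 200 ms.

```python
from typing import Optional


class BargeInDetector:
    """Hypothetical detector that gates VAD frames by minimum durations."""

    def __init__(self, start_ms: int = 450, stop_ms: int = 200) -> None:
        self.start_ms = start_ms          # continuous speech needed to start
        self.stop_ms = stop_ms            # continuous silence needed to stop
        self.speech_since: Optional[int] = None
        self.silence_since: Optional[int] = None
        self.active = False               # True while an interruption is in progress

    def feed(self, t_ms: int, is_speech: bool) -> Optional[str]:
        """Process one VAD frame; return "start", "stop", or None."""
        if is_speech:
            self.silence_since = None
            if self.speech_since is None:
                self.speech_since = t_ms
            if not self.active and t_ms - self.speech_since >= self.start_ms:
                self.active = True
                return "start"
        else:
            # Any silent frame breaks the run of continuous speech.
            self.speech_since = None
            if self.active:
                if self.silence_since is None:
                    self.silence_since = t_ms
                if t_ms - self.silence_since >= self.stop_ms:
                    self.active = False
                    self.silence_since = None
                    return "stop"
        return None
```

In this sketch a 200 ms burst of noise never produces a "start" event, which is the kind of false interruption the start duration is meant to filter out.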

Client Controls

These settings regulate client-side speech events and buffering during interruption detection.

| Option | Value | Description |
| --- | --- | --- |
| Client User Speaking Debounce | 100 ms | Minimum interval between user-speaking notifications received from the client browser to prevent excessive start/stop events. |
| Soft Hold Client Buffer Maximum | 2000 ms | Maximum amount of client audio buffered while evaluating a potential soft barge-in. |
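
As a rough illustration of these two controls, the sketch below shows a minimum-interval debounce and a duration-capped audio buffer. Both classes (`Debouncer`, `SoftHoldBuffer`) are hypothetical names. Dropping the oldest audio when the cap is exceeded is an assumption; the documentation does not say what happens when the buffer is full.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class Debouncer:
    """Accepts an event only if the minimum interval has passed since the last accepted one."""

    def __init__(self, min_interval_ms: int = 100) -> None:
        self.min_interval_ms = min_interval_ms
        self._last_ms: Optional[int] = None

    def allow(self, now_ms: int) -> bool:
        if self._last_ms is not None and now_ms - self._last_ms < self.min_interval_ms:
            return False  # too soon after the previous accepted event
        self._last_ms = now_ms
        return True


class SoftHoldBuffer:
    """Holds client audio during a soft barge-in probe, capped by total duration."""

    def __init__(self, max_ms: int = 2000) -> None:
        self.max_ms = max_ms
        self.chunks: Deque[Tuple[bytes, int]] = deque()
        self.buffered_ms = 0

    def push(self, chunk: bytes, duration_ms: int) -> None:
        self.chunks.append((chunk, duration_ms))
        self.buffered_ms += duration_ms
        # Assumption: evict the oldest chunks once the cap is exceeded.
        while self.buffered_ms > self.max_ms and self.chunks:
            _, dropped_ms = self.chunks.popleft()
            self.buffered_ms -= dropped_ms
```

The same minimum-interval pattern would also fit the server-side VAD debounce described under Speech Detection.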

Soft Barge-In Behavior

Soft barge-in allows the system to temporarily pause assistant playback while determining whether the user intends to interrupt.

| Option | Value | Description |
| --- | --- | --- |
| Soft Barge-In Probe Timeout | 600 ms | Maximum time allowed for a soft barge-in probe before a decision is made to resume or interrupt the assistant. |
| Soft Barge-In Post Quiet Tail | 2000 ms | After VAD detects silence during a soft barge-in probe, waits this duration before resuming assistant speech. A value of 0 resumes immediately after silence is detected. |
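
The documentation does not spell out how the probe timeout and the quiet tail interact, so the following is only one plausible reading, sketched under stated assumptions: a confirmed interruption always wins; until the probe timeout elapses the assistant stays paused; after that, playback resumes once the user has been silent for the quiet tail. The function name and parameters are hypothetical.

```python
from typing import Optional


def soft_probe_step(
    elapsed_ms: int,
    confirmed: bool,
    silence_ms: Optional[int],
    probe_timeout_ms: int = 600,
    post_quiet_tail_ms: int = 2000,
) -> str:
    """Return "interrupt", "resume", or "hold" for the current probe state.

    elapsed_ms: time since assistant playback was paused.
    confirmed:  whether the interruption has been confirmed.
    silence_ms: how long VAD has reported silence, or None if the user is still speaking.
    """
    if confirmed:
        return "interrupt"
    if elapsed_ms < probe_timeout_ms:
        return "hold"  # still probing, keep playback paused
    # Assumption: after an unconfirmed probe, wait out the quiet tail before resuming.
    if silence_ms is not None and silence_ms >= post_quiet_tail_ms:
        return "resume"
    return "hold"
```

Under this reading, setting the quiet tail to 0 makes playback resume as soon as silence is detected after the probe window, which matches the documented behavior for a value of 0.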

Interruption Confirmation

These settings define the minimum speech recognition evidence required before a detected interruption is confirmed.

| Option | Value | Description |
| --- | --- | --- |
| Barge-In Confirm Minimum Words | 1 | Minimum number of recognized words required to confirm an interruption. |
| Barge-In Confirm Minimum Characters | 0 | Minimum number of recognized characters required to confirm an interruption. A value of 0 disables this threshold. |
| Barge-In Confirm Minimum Partials | 1 | Minimum number of partial speech recognition results required before confirming an interruption. |
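
A confirmation check along these lines might look like the sketch below. It assumes that all enabled thresholds must be met together, that words are split on whitespace, and that characters are counted on the trimmed transcript; none of these details are stated in the documentation, and `is_interruption_confirmed` is a hypothetical name.

```python
def is_interruption_confirmed(
    transcript: str,
    partial_count: int,
    min_words: int = 1,
    min_chars: int = 0,
    min_partials: int = 1,
) -> bool:
    """Check recognized speech against the confirmation thresholds."""
    text = transcript.strip()
    if len(text.split()) < min_words:
        return False
    # A character threshold of 0 disables this check.
    if min_chars > 0 and len(text) < min_chars:
        return False
    return partial_count >= min_partials
```

With the default values, a single recognized word backed by one partial result is enough to confirm an interruption; raising the word or character minimum trades responsiveness for fewer false confirmations.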

Interruption Phrases

These phrases trigger an immediate interruption when detected while the assistant is speaking.

| Option | Value | Description |
| --- | --- | --- |
| Barge-In Interruption Phrases | stop, wait, hold on, hang on, excuse me, never mind, cancel that, slow down | When any of these phrases are detected while the assistant is speaking, an immediate hard interruption is triggered. Matching is case-insensitive. |
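
Case-insensitive phrase matching could be sketched as follows. The documentation only states that matching is case-insensitive; stripping punctuation and matching on whole-word boundaries anywhere in the utterance are assumptions made here, and the helper names are hypothetical.

```python
import re
from typing import Iterable

INTERRUPTION_PHRASES = [
    "stop", "wait", "hold on", "hang on",
    "excuse me", "never mind", "cancel that", "slow down",
]


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9']+", " ", text.lower()).split())


def contains_interruption_phrase(
    text: str, phrases: Iterable[str] = INTERRUPTION_PHRASES
) -> bool:
    """True if any phrase appears in the text as a whole-word sequence."""
    padded = f" {normalize(text)} "
    return any(f" {normalize(p)} " in padded for p in phrases)
```

Padding with spaces keeps "stop" from matching inside a longer word such as "unstoppable" while still matching "Please STOP."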

Acknowledgement Phrases

These phrases are treated as passive acknowledgements and do not interrupt the conversation flow.

| Option | Value | Description |
| --- | --- | --- |
| Barge-In Acknowledgement Phrases | uh huh, uh-huh, mm hmm, mm-hmm, mhm, yeah, yep, okay, i see, sure, got it, alright | Short acknowledgement phrases that are ignored while the assistant is speaking or processing. These do not generate chat messages or language model requests. |
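
One way to picture acknowledgement filtering is a whole-utterance comparison against the configured list. The sketch below assumes an utterance is ignored only when the entire normalized text equals one of the phrases, so "yeah, but wait" would not be filtered; that matching rule and the function names are illustrative assumptions rather than documented behavior.

```python
import re
from typing import Iterable

ACKNOWLEDGEMENT_PHRASES = [
    "uh huh", "uh-huh", "mm hmm", "mm-hmm", "mhm", "yeah",
    "yep", "okay", "i see", "sure", "got it", "alright",
]


def normalize_utterance(text: str) -> str:
    """Lowercase, replace punctuation (including hyphens) with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9']+", " ", text.lower()).split())


def is_acknowledgement(
    text: str, phrases: Iterable[str] = ACKNOWLEDGEMENT_PHRASES
) -> bool:
    """True if the whole utterance is one of the passive acknowledgement phrases."""
    known = {normalize_utterance(p) for p in phrases}
    return normalize_utterance(text) in known
```

An utterance that passes this check would be dropped before any chat message or language model request is created.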

Notes

  • Hard Barge-In immediately stops assistant playback and gives control to the user.
  • Soft Barge-In temporarily pauses assistant playback while the system determines whether the user genuinely intends to interrupt.
  • Voice Activity Detection (VAD) is used to distinguish actual speech from background noise and brief audio artifacts.
  • Phrase-based interruption provides a fast path for common commands such as "stop" or "wait" without requiring full interruption confirmation logic.