VAD not emitting UserStoppedSpeaking Frame, causing the bot to stuck and not respond #1493

sphatate · 2025-04-01T08:31:29Z

We are using a VAD -> STT -> LLM -> TTS architecture.

We have often observed that the bot doesn't respond. After debugging, we found that the VAD is not emitting the UserStoppedSpeaking Frame.

Further debugging made me realize that the issue is caused by this condition:

    if (
        self._vad_state == VADState.STOPPING
        and self._vad_stopping_count >= self._vad_stop_frames
    ):
        self._vad_state = VADState.QUIET
        self._vad_stopping_count = 0

In the above condition, _vad_stopping_count is always less than _vad_stop_frames, and I don't understand why this is happening. We have set stop_secs to 0.8, so:

_vad_stop_frames is ~35
_vad_stopping_count always stops incrementing somewhere between 10 and 13 (i.e. it never goes above 13), so the condition is never satisfied.
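
To make the symptom easier to reason about, here is a minimal, self-contained sketch of a stop-counter state machine built around the condition quoted above. It is not Pipecat's actual implementation: `run_vad`, the per-frame `speech_flags` input, and the value 9 for the start threshold are assumptions made for illustration; only the STOPPING check and the ~35 stop frames come from the report. It shows two hypothetical ways the counter could stall around 12: audio input that stops arriving shortly after the user goes quiet, or an occasional frame classified as speech that resets the count.

```python
from enum import Enum


class VADState(Enum):
    QUIET = 1
    STARTING = 2
    SPEAKING = 3
    STOPPING = 4


def run_vad(speech_flags, start_frames, stop_frames):
    """Feed per-frame speech flags through a toy VAD state machine.

    Returns (events, max_stopping_count). Hypothetical helper, for
    illustration only.
    """
    state = VADState.QUIET
    starting_count = 0
    stopping_count = 0
    max_stopping_count = 0
    events = []

    for is_speech in speech_flags:
        if is_speech:
            if state == VADState.QUIET:
                state = VADState.STARTING
                starting_count = 1
            elif state == VADState.STARTING:
                starting_count += 1
            elif state == VADState.STOPPING:
                # Speech seen again: resume SPEAKING and reset the counter.
                state = VADState.SPEAKING
                stopping_count = 0
        else:
            if state == VADState.STARTING:
                state = VADState.QUIET
                starting_count = 0
            elif state == VADState.SPEAKING:
                state = VADState.STOPPING
                stopping_count = 1
            elif state == VADState.STOPPING:
                stopping_count += 1

        max_stopping_count = max(max_stopping_count, stopping_count)

        if state == VADState.STARTING and starting_count >= start_frames:
            state = VADState.SPEAKING
            starting_count = 0
            events.append("started")

        # Same shape as the condition quoted in the report.
        if state == VADState.STOPPING and stopping_count >= stop_frames:
            state = VADState.QUIET
            stopping_count = 0
            events.append("stopped")

    return events, max_stopping_count


speech = [True] * 20

# Continuous silence after speech: the counter reaches stop_frames.
clean = run_vad(speech + [False] * 40, start_frames=9, stop_frames=35)

# Assumed scenario A: input stops arriving after 12 silent frames.
truncated = run_vad(speech + [False] * 12, start_frames=9, stop_frames=35)

# Assumed scenario B: every 13th frame is classified as speech again.
noisy = run_vad(speech + ([False] * 12 + [True]) * 5, start_frames=9, stop_frames=35)
```

In this sketch only the `clean` run produces a "stopped" event; in the other two the counter peaks at 12 and the stop condition is never met. If the stop threshold scales linearly with stop_secs (an assumption), 35 frames for 0.8 s means about 23 ms per frame, so a counter stalling at 10 to 13 corresponds to roughly 0.23 to 0.30 s of detected silence.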

markbackman · 2025-04-03T01:19:04Z

Two questions:

What are your VAD settings?
What version of Pipecat?

In 0.0.57, we added handling for the case where the VAD doesn't fire but a TranscriptionFrame is received. This will result in a completion occurring.

In my experience, this works robustly.

sphatate · 2025-04-03T03:31:38Z

Hi @markbackman

These are our VAD settings:

confidence = 0.5
start_secs = 0.2
stop_secs = 0.8
volume = 0.5

We are using Pipecat version 0.0.60.

sphatate · 2025-04-03T03:32:17Z

Could it be something to do with the Deepgram or Azure transcriber? We are facing this issue with both.

Also, this is not happening with phone calls; it only happens over WebSocket when we do web calls.

markbackman · 2025-04-03T11:59:07Z

This is not a known issue. It sounds like something in your pipeline might be blocking the UserStoppedSpeakingFrame. A few questions:

Have you customized any parts of Pipecat?
Do you have any wrappers around the services (STT, LLM, or TTS)?
Do you have any custom processors in your pipeline?

If yes to any of those, please make sure you're pushing frames down the pipeline in all cases.
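
As a rough illustration of what "pushing frames down the pipeline in all cases" means, here is a toy, self-contained sketch. The `Frame`, `Processor`, `DroppingFilter`, `ForwardingFilter`, and `Sink` classes are hypothetical stand-ins, not Pipecat's classes; only the frame names are taken from Pipecat. A processor that forwards only the frames it handles silently swallows everything else, including UserStoppedSpeakingFrame.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str


class Processor:
    """Hypothetical pipeline stage; forwards every frame by default."""

    def __init__(self):
        self.next = None

    def push(self, frame):
        if self.next is not None:
            self.next.process(frame)

    def process(self, frame):
        self.push(frame)


class DroppingFilter(Processor):
    """Buggy: forwards only the frame type it handles."""

    def process(self, frame):
        if frame.name == "TranscriptionFrame":
            self.push(frame)
        # Every other frame is silently dropped here.


class ForwardingFilter(Processor):
    """Handles transcriptions but still forwards every frame."""

    def __init__(self):
        super().__init__()
        self.transcripts = 0

    def process(self, frame):
        if frame.name == "TranscriptionFrame":
            self.transcripts += 1
        self.push(frame)  # Always forward, handled or not.


class Sink(Processor):
    """Records the names of the frames that reach the end."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def process(self, frame):
        self.seen.append(frame.name)


def run(middle):
    """Send three frames through `middle` and return what the sink saw."""
    sink = Sink()
    middle.next = sink
    for name in (
        "UserStartedSpeakingFrame",
        "TranscriptionFrame",
        "UserStoppedSpeakingFrame",
    ):
        middle.process(Frame(name))
    return sink.seen


dropped = run(DroppingFilter())
forwarded = run(ForwardingFilter())
```

With `DroppingFilter` in the middle, only the TranscriptionFrame reaches the sink; with `ForwardingFilter`, all three frames arrive in order.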

markbackman · 2025-04-12T01:48:12Z

@sphatate any update? Otherwise, I'll close the issue.

sphatate · 2025-04-12T02:08:35Z

I have not customized anything. This only happens over WebSocket when used for web calls. We are not facing this issue with phone calls on Twilio.

markbackman · 2025-04-12T02:39:25Z

Do you have a single-file repro that you can share, along with what to look for?

sphatate changed the title ~~VAD not emitting UserStoppedSpeaking Frame~~ VAD not emitting UserStoppedSpeaking Frame, causing the bot to stuck and not respond Apr 1, 2025

markbackman self-assigned this Apr 12, 2025
