Releases: livekit/agents
[email protected]
Patch Changes
- use rtc.combine_audio_frames - #841 (@theomonnom)
- Fix agent state to not change to listening when user speaks - #857 (@martin-purplefish)
  Fixed canceling uncancelable speech
  Fixed bug where agent would get stuck with uninterruptible speech.
- Fix bug where empty audio would cause agent to get stuck. - #836 (@martin-purplefish)
- fix: handle when STT does not return any speech - #854 (@davidzhao)
- Fix watcher reloaded processes double connecting to rooms - #822 (@keepingitneil)
- voice-pipeline: avoid stacked replies when interruptions are disallowed - #869 (@theomonnom)
- disable preemptive_synthesis by default - #867 (@theomonnom)
- Fixed bug where agent would get stuck on non-interruptible speech - #850 (@martin-purplefish)
- use EventEmitter from rtc - #879 (@theomonnom)
- AudioByteStream: avoid empty frames on flush - #840 (@theomonnom)
- improve worker logs - #878 (@theomonnom)
- voice-pipeline: fix tts_forwarder not always being closed - #871 (@theomonnom)
- bump livekit-rtc to v0.17.5 - #880 (@theomonnom)
- Fixed bug where agent would freeze if before_llm_cb returned false - #865 (@martin-purplefish)
[email protected]
Patch Changes
- oai-realtime: fix function calls - #826 (@KillianLucas)
[email protected]
Patch Changes
- Fix CI x LFS issue for silero plugin - #818 (@keepingitneil)
[email protected]
Minor Changes
- silero: support any sample rate - #805 (@theomonnom)
Patch Changes
- silero: add prefix_padding_duration #801 - #805 (@theomonnom)
[email protected]
Patch Changes
- oai-realtime: log response errors - #819 (@theomonnom)
[email protected]
Minor Changes
- OpenAI Realtime API support - #814 (@theomonnom)
Patch Changes
- Add Telnyx integration for LLM - #803 (@jamestwhedbee)
[email protected]
✨ [NEW] OpenAI Realtime API support
We're partnering with OpenAI on a new MultimodalAgent API in the Agents framework. This class completely wraps OpenAI’s Realtime API, abstracts away the raw wire protocol, and provides an ultra-low latency WebRTC transport between GPT-4o and your users’ devices. This same stack powers Advanced Voice in the ChatGPT app.
- Try the Realtime API in our playground [code]
- Check out our guide to building your first app with this new API
Patch Changes
- bump livekit to v0.17.2 - #815 (@theomonnom)
- silero: support any sample rate - #805 (@theomonnom)
[email protected]
Patch Changes
- Fix function for OpenAI Assistants - #784 (@keepingitneil)
[email protected]
0.9.0 and 0.9.1 pack significant improvements to the reliability and performance of VoiceAssistant.
The main changes are:
- Rework of audio publishing/buffering to reduce glitches caused by Python's asyncio scheduler
- Limiting number of prewarm workers that are spawned
- Introducing lk.agent.state attribute to allow client to detect state of VoiceAssistant: speaking, listening, etc
  - Works out of the box with JS Components 2.6.0. See example frontend
- Fixed a rare case where Agent wouldn't reconnect to room after being disconnected
- Additional control for VoiceAssistant
  - min_endpointing_delay: control over how quickly the agent should respond when the user pauses (longer delay reduces agent interruptions)
  - before_llm_cb: ability to control what is being sent to the LLM
  - before_tts_cb: ability to modify text before it's sent to TTS
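To make the new VoiceAssistant controls more concrete, here is a minimal, self-contained sketch of how such hooks could fit into one conversational turn. It does not use the livekit-agents package: `run_turn`, `should_respond`, `keep_last_two`, `spell_out_units`, and `echo_llm` are hypothetical names, and the callback signatures are assumptions made for illustration. Only the parameter names `min_endpointing_delay`, `before_llm_cb`, and `before_tts_cb` come from the notes above. The "return False to skip the reply" behaviour is likewise an assumption, loosely based on the before_llm_cb changelog entry for #865.

```python
from typing import Callable, List, Optional, Union

# Hypothetical callback shapes; NOT the livekit-agents signatures.
BeforeLLM = Callable[[List[str]], Union[List[str], bool, None]]
BeforeTTS = Callable[[str], str]


def should_respond(silence_seconds: float, min_endpointing_delay: float) -> bool:
    """Toy endpointing rule: reply only once the pause is long enough."""
    return silence_seconds >= min_endpointing_delay


def run_turn(
    history: List[str],
    user_text: str,
    llm: Callable[[List[str]], str],
    before_llm_cb: Optional[BeforeLLM] = None,
    before_tts_cb: Optional[BeforeTTS] = None,
) -> Optional[str]:
    """Return the text that would go to TTS, or None if the reply is skipped."""
    context = history + [user_text]
    if before_llm_cb is not None:
        result = before_llm_cb(context)
        if result is False:
            # Assumed convention: False means "do not answer this turn".
            return None
        if isinstance(result, list):
            # The hook may replace what is sent to the LLM.
            context = result
    reply = llm(context)
    if before_tts_cb is not None:
        # The hook may rewrite the text before it is synthesized.
        reply = before_tts_cb(reply)
    return reply


def keep_last_two(context: List[str]) -> List[str]:
    """Example before_llm_cb: trim the context to the last two messages."""
    return context[-2:]


def spell_out_units(text: str) -> str:
    """Example before_tts_cb: expand an abbreviation so TTS reads it naturally."""
    return text.replace(" km", " kilometers")


def echo_llm(context: List[str]) -> str:
    """Stand-in for a real LLM call."""
    return f"You said: {context[-1]}"
```

In this sketch, a larger `min_endpointing_delay` makes `should_respond` wait through longer pauses, which matches the note that a longer delay reduces agent interruptions. Consult the VoiceAssistant documentation for the actual callback signatures.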
Detailed Changelog
Patch Changes
- fix VoiceAssistant being stuck when interrupting before user speech is committed - #790 (@coderlxn)
- Fix function for OpenAI Assistants - #784 (@keepingitneil)
[email protected]
Patch Changes
- avoid returning tiny frames from TTS - #747 (@theomonnom)
- Fixing Assistant API Vision Capabilities - #771 (@keepingitneil)