Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add AVSynchronizer #1168

Closed
wants to merge 11 commits into from
Closed

feat: add AVSynchronizer #1168

wants to merge 11 commits into from

Conversation

longcw
Copy link
Collaborator

@longcw longcw commented Dec 3, 2024

No description provided.

Copy link

changeset-bot bot commented Dec 3, 2024

⚠️ No Changeset found

Latest commit: dcb5190

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

livekit-agents/livekit/agents/utils/av_sync.py Outdated Show resolved Hide resolved
)

async for video_frame, audio_frame in video_generator:
av_sync.push(video_frame)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
av_sync.push(video_frame)
await av_sync.push(video_frame)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for docs, maybe also clarify that it could be pushed from different tasks?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll create an example to test and demonstrate the use case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this example is awesome! it'll be super useful. It might be useful to include an example file too.. I'll see about moving some of these examples to its own organized repo in livekit-examples org

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, or I was thinking of another example of reading video and audio from the client and sending it back, or an audio wave with a graph so we can easily see the synchronization while not needing an additional mp4 file.

livekit-agents/livekit/agents/utils/av_sync.py Outdated Show resolved Hide resolved
)
self._capture_video_task = asyncio.create_task(self._capture_video())

async def push(self, frame: Union[rtc.VideoFrame, rtc.AudioFrame]) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this wait to have at least one AudioFrame and one VideoFrame to start the capture? So we synchronize the start of the stream as well

Copy link
Collaborator Author

@longcw longcw Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no, the agent can stream only video frames if there is no audio at the beginning. The user only need to make sure when there is a audio for playing, the first audio frame is pushed at the same time with the corresponding video frame.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to make the queue_size smaller enough for that case, e.g. 100ms or even less.


async def push(self, frame: Union[rtc.VideoFrame, rtc.AudioFrame]) -> None:
if isinstance(frame, rtc.AudioFrame):
# TODO: test if frame duration is too long
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we need to keep track of the pushed audio.

@theomonnom
Copy link
Member

Any idea on how we could handle interruptions with it?

@longcw
Copy link
Collaborator Author

longcw commented Dec 5, 2024

Any idea on how we could handle interruptions with it?

First if queue_size_ms is small, the whole process is synced and the pushing speed is controlled by the playing. so for interruption we just need to stop the audio pushing like the current pipeline agent.

I can also add a clear_queue method for the AVSync in case the queue size is big as we discussed.

@davidzhao
Copy link
Member

@longcw I'm also wondering, perhaps it'd be better for the core sync utils to go into our Python SDK? https://github.com/livekit/python-sdks/

since it doesn't require agents here.

@longcw
Copy link
Collaborator Author

longcw commented Dec 6, 2024

@longcw I'm also wondering, perhaps it'd be better for the core sync utils to go into our Python SDK? https://github.com/livekit/python-sdks/

since it doesn't require agents here.

I agree. I'll finish this with some more testing and documents, then maybe create a pr for the sdk?

@longcw longcw changed the title [draft] feat: add AVSynchronizer feat: add AVSynchronizer Dec 7, 2024
@longcw
Copy link
Collaborator Author

longcw commented Dec 11, 2024

@davidzhao The PR for the python sdk livekit/python-sdks#324. I think we can close this one after that.

@davidzhao davidzhao closed this Dec 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants