
Support for Horizontal Scaling with Mediasoup Server #21

Open
gabrielmatau79 opened this issue Nov 14, 2024 · 1 comment

Comments

@gabrielmatau79
Collaborator

gabrielmatau79 commented Nov 14, 2024

Problem Statement

Mediasoup is architected to efficiently handle real-time media communications within a single server instance, utilizing workers that correspond to CPU cores. Each Worker operates as a separate C++ subprocess on a single CPU core and can manage multiple Routers, which facilitate media exchange in virtual rooms. While this design is effective for single-server deployments, it presents challenges when scaling horizontally across multiple servers. Currently, there is no built-in mechanism for synchronizing state across multiple Mediasoup instances, limiting the ability to distribute load and manage rooms seamlessly in a multi-server environment.

Mediasoup is designed to leverage CPU cores efficiently by creating a Worker for each core. Each Worker runs on a single core and is capable of handling multiple Routers, which manage media streams in rooms. Below are some key insights into the capacity and consumption characteristics:

Worker Capacity:

  • Each Worker can typically handle up to 500 consumers (streams being received) under optimal conditions, but this capacity is highly dependent on CPU performance, media quality (resolution and bitrate), and the number of producers and consumers.

  • Workers are CPU-intensive, particularly when handling high-quality video streams. Adding more consumers or increasing media quality (e.g., 1080p or 4K) increases CPU usage, potentially limiting the total number of consumers a Worker can handle efficiently.
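The per-core Worker model above can be sketched as follows (assuming mediasoup v3; `createWorker()` and the `died` event are real mediasoup API, but the `workerPortRange` helper and its port numbers are illustrative choices, not mediasoup defaults):

```typescript
import * as os from 'os';

// Illustrative helper: give each Worker its own non-overlapping RTC port range.
function workerPortRange(i: number, base = 40000, size = 1000) {
  return { rtcMinPort: base + i * size, rtcMaxPort: base + (i + 1) * size - 1 };
}

// One Worker per CPU core. mediasoup is loaded with require() inside the
// function so the pure helper above stays usable without the package installed.
async function createWorkers() {
  const mediasoup = require('mediasoup');
  const workers: any[] = [];
  for (let i = 0; i < os.cpus().length; i++) {
    const worker = await mediasoup.createWorker({
      logLevel: 'warn',
      ...workerPortRange(i),
    });
    // A dead Worker loses all of its Routers, so the app must restart or drain it.
    worker.on('died', () => console.error(`worker ${worker.pid} died`));
    workers.push(worker);
  }
  return workers;
}
```

Keeping each Worker's RTC port range disjoint also simplifies firewall and Kubernetes service configuration later on.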

Router Capacity:

  • Routers within each Worker manage media distribution within a room. They handle producers (sending media) and consumers (receiving media), allowing multiple peers to interact in real-time.

  • To achieve horizontal scalability, pipeToRouter() lets one Router forward streams to another Router (for example, one running in a different Worker) within the same host; forwarding between Routers on different servers requires manually pairing PipeTransports. Either way, this distributes media processing across multiple Routers, balancing the load effectively.

  • Each Router’s load is determined by the number of consumers and the quality of streams. High-quality video streams and a large number of consumers can quickly increase the CPU load.

Example of Load in a Room:

In a room with 4 participants, each sending an audio and a video track, there are 8 producers and 24 consumers (each participant receives the audio and video tracks of the other 3). This configuration already imposes a significant CPU load on the Worker handling the room.

See the following diagram:

[diagram image]
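The counting above can be made explicit with a small helper (plain arithmetic, no mediasoup API; it counts each audio and each video track as a separate mediasoup Producer):

```typescript
// Per-room load: one audio and one video track per participant,
// each consumed by every other participant in the room.
function roomLoad(participants: number) {
  const producers = 2 * participants;                      // audio + video per peer
  const consumers = 2 * participants * (participants - 1); // every peer consumes everyone else
  return { producers, consumers };
}

// roomLoad(4) → { producers: 8, consumers: 24 }
```

The quadratic growth of `consumers` is exactly why a single Worker saturates quickly as rooms grow.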

Proposed Solution

To enable horizontal scaling in Mediasoup, we propose the following enhancements:

  1. Inter-Router Communication via pipeToRouter() API:

Utilize the existing router.pipeToRouter() API to interconnect different Mediasoup Routers, allowing media streams to be forwarded between Routers on the same host (for example, Routers in different Workers). Forwarding between Routers on different servers requires manually interconnecting them with PipeTransports (router.createPipeTransport()). This approach facilitates the distribution of media streams across multiple Routers, enhancing scalability.
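A minimal sketch of this approach (router.pipeToRouter() and its { pipeConsumer, pipeProducer } result are real mediasoup v3 API; the routers are typed loosely here so the sketch stands alone):

```typescript
// Forward a producer from routerA into routerB (e.g. a Router in another
// Worker). pipeToRouter() creates, or reuses, a pair of PipeTransports
// between the two Routers and mirrors the producer into routerB.
async function pipeBetweenRouters(routerA: any, routerB: any, producerId: string) {
  const { pipeProducer } = await routerA.pipeToRouter({
    producerId,
    router: routerB,
  });
  // Peers attached to routerB can now consume pipeProducer.id as usual.
  return pipeProducer;
}
```

Because the mirrored producer behaves like a local one on routerB, consuming peers need no knowledge of where the original media entered the system.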

  2. Load Balancing Application for Multiple Mediasoup Servers with Kubernetes Ingress and Ports

This option proposes the development of a load balancing application to efficiently manage multiple Mediasoup server deployments across one or multiple Kubernetes clusters. The objective is to centralize room creation requests and worker monitoring while ensuring scalability and efficient resource utilization.

Proposed Solution:

Load Balancing Application:

  1. Worker Registration and Status Updates:
  • Create endpoints to register Mediasoup server workers and update their status.
  • Mediasoup servers will notify the load balancer of their status using webhook calls.
  2. Centralized Room Management:
  • Develop a /rooms/{roomId} endpoint in the load balancer.
  • This endpoint will centralize room creation requests and assign them to the server with the best availability.
  • The endpoint will return the connection object currently provided by the Mediasoup application.
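The assignment logic behind /rooms/{roomId} could look like this (`ServerStatus` and `pickServer` are hypothetical names for this proposal, not an existing API; the capacity figures reuse the rough ~500-consumers-per-Worker bound from above):

```typescript
// Hypothetical status record kept by the load balancer per registered server.
interface ServerStatus {
  host: string;           // ingress host reported at registration
  port: number;           // connection port (default 443)
  activeConsumers: number;
  maxConsumers: number;   // e.g. ~500 per Worker × number of Workers
}

// Pick the registered server with the most spare consumer capacity;
// returns undefined when every server is saturated.
function pickServer(servers: ServerStatus[]): ServerStatus | undefined {
  return servers
    .filter((s) => s.activeConsumers < s.maxConsumers)
    .sort(
      (a, b) =>
        b.maxConsumers - b.activeConsumers - (a.maxConsumers - a.activeConsumers)
    )[0];
}
```

The /rooms/{roomId} handler would call `pickServer()`, create the room on the chosen server, and return that server's connection object to the client.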

Required Modifications to Mediasoup Server Application:

  1. Server Worker Registration:
  • Add functionality for Mediasoup servers to register themselves with the load balancer using a POST request.
  • Include server-specific information such as ingress host and connection port (default: 443).
  2. Server Availability Notifications:
  • Implement webhooks to notify the load balancer about worker availability during room creation and deletion.
  • These notifications will provide real-time status updates to the load balancer, enabling better room assignment decisions.
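One way the server-side half could look (the endpoint paths, payload fields, and `LOAD_BALANCER_URL` are assumptions for illustration, not an existing API; `fetch` is the standard global in Node 18+):

```typescript
const LOAD_BALANCER_URL = 'https://lb.example.com'; // hypothetical address

// Payload sent when a Mediasoup server registers itself (ingress host + port).
function registrationPayload(host: string, port = 443) {
  return { host, port };
}

// One-time registration with the load balancer on startup.
async function registerServer(host: string, port = 443) {
  await fetch(`${LOAD_BALANCER_URL}/servers`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(registrationPayload(host, port)),
  });
}

// Webhook fired on room creation/deletion so the balancer sees near-real-time load.
async function notifyAvailability(serverId: string, activeConsumers: number) {
  await fetch(`${LOAD_BALANCER_URL}/servers/${serverId}/status`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ activeConsumers, at: Date.now() }),
  });
}
```

Pushing status on room lifecycle events (rather than polling) keeps the balancer's view fresh without constant traffic.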

Deployment Considerations in Kubernetes:

  1. TURN Server Association:
  • Each Mediasoup server deployment must include a dedicated TURN server.
  • Ensure specific configurations for each TURN server to avoid conflicts.
  2. Unique Port Configurations:
  • For deployments within the same Kubernetes cluster, configure distinct UDP ports for each TURN server to handle incoming traffic.
  3. Independent Ingress and Services:
  • Ensure each Mediasoup server has independent ingress, services, and configurations.
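A deterministic port scheme, sketched as plain code, avoids the TURN/RTC port collisions described above (all numbers are illustrative assumptions, apart from 3478 being the conventional TURN port):

```typescript
// Distinct, non-overlapping ports for each deployment sharing a cluster.
function deploymentPorts(index: number) {
  return {
    turnUdpPort: 3478 + index,          // one distinct TURN listener per deployment
    rtcMinPort: 40000 + index * 2000,   // mediasoup RTC range for this deployment
    rtcMaxPort: 40000 + index * 2000 + 1999,
  };
}
```

Deriving every deployment's ports from a single index makes the Kubernetes Service and firewall definitions mechanical to generate and impossible to overlap.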

Diagram:

[diagram image]

@lotharking

lotharking commented Nov 18, 2024

IMHO, the options based on pipeToRouter() or Redis both sound quite good, especially Redis, since we have more experience with it. Based on the documentation, though, pipeToRouter() seems to be the more widely accepted approach. While Redis works well for messaging, I'm unsure how well it would perform within the flow of a WebRTC process.
The last option seems to offer very little scalability, since it would require always knowing the state of each deployment on a given port. I don't think it would be the optimal solution.
