Modular pipeline for processing data streams and measuring each step

bigbluebutton-bot/stream_pipeline

Stream Pipeline

Overview

stream_pipeline is a modular pipeline for handling data streams, for example audio streams. The project aims to provide a robust and flexible framework for processing streaming data through a series of modular, configurable components. Each step is measured, and the measurements can be used to optimize the pipeline. For a detailed description of the architecture, please refer to the docs.

How to use this package

  1. Create a new file called requirements.txt in the root of your project and add the following lines:
git+https://github.com/bigbluebutton-bot/stream_pipeline
mypy
  2. Install the packages by running the following command:
pip3 install -r requirements.txt
  3. Example: Download the example files main.py, server_external_module.py and data.py from this repository into the root of your project:
wget https://raw.githubusercontent.com/bigbluebutton-bot/stream_pipeline/main/server_external_module.py
wget https://raw.githubusercontent.com/bigbluebutton-bot/stream_pipeline/main/main.py
wget https://raw.githubusercontent.com/bigbluebutton-bot/stream_pipeline/main/data.py
  4. Open two terminals and run one of the following commands in each to start the pipeline:
python3 server_external_module.py
python3 main.py

Architecture

The pipeline is designed to be modular and flexible. Each module can be replaced with a custom implementation. For a detailed description of the architecture, please refer to the docs.
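To illustrate the general idea of replaceable modules with per-step measurement, here is a minimal sketch. It does not use the stream_pipeline API: run_pipeline, normalize and to_int16 are hypothetical names made up for this example, and the real classes are described in the docs.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A step is any callable that takes one item and returns the processed item.
Step = Callable[[Any], Any]


def run_pipeline(steps: List[Tuple[str, Step]], item: Any) -> Tuple[Any, Dict[str, float]]:
    """Run the item through each named step and record how long each step took."""
    timings: Dict[str, float] = {}
    for name, step in steps:
        start = time.perf_counter()
        item = step(item)
        timings[name] = time.perf_counter() - start
    return item, timings


def normalize(samples: List[float]) -> List[float]:
    """Hypothetical module: scale samples so the largest magnitude is 1.0."""
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    return [s / peak for s in samples]


def to_int16(samples: List[float]) -> List[int]:
    """Hypothetical module: convert float samples to 16-bit integer range."""
    return [int(round(s * 32767)) for s in samples]


# Any step can be swapped for a custom implementation with the same signature.
result, timings = run_pipeline(
    [("normalize", normalize), ("to_int16", to_int16)],
    [0.5, -1.0, 0.25],
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f}s")
```

The timings dictionary shows which step dominates the processing time, which is the kind of information the per-step measurements are meant to provide.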

Dev

git clone https://github.com/bigbluebutton-bot/stream_pipeline
cd stream_pipeline

Install the dependencies needed for development

pip3 install -r requirements.txt
pip3 install grpcio-tools mypy-protobuf mypy

Check for type errors

pip3 install mypy
mypy --check-untyped-defs --disallow-untyped-defs main.py
mypy --check-untyped-defs --disallow-untyped-defs server_external_module.py
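As a small illustration of what these flags enforce: with --disallow-untyped-defs, mypy reports an error for any function defined without type annotations, and --check-untyped-defs makes it type-check the bodies of such functions as well. The two functions below are hypothetical and not part of this repository.

```python
from typing import List


def mean_untyped(values):
    # Rejected by --disallow-untyped-defs: parameters and return type are not annotated.
    return sum(values) / len(values)


def mean_typed(values: List[float]) -> float:
    # Accepted: every parameter and the return value carry a type annotation.
    return sum(values) / len(values)
```

Both functions behave the same at runtime; only the annotated one passes the type check.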

After changing the proto file

Generate proto files:

pip3 install grpcio-tools mypy-protobuf mypy
cd stream_pipeline && python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. --mypy_out=. --mypy_grpc_out=. data.proto && cd ..

Running the pipeline with the freshly generated files will raise an import error:

{
    "message": "No module named 'data_pb2'",
    "id": "Error-e22a2f0e-bde3-4ae4-ac34-d77388b17a9e",
    "type": "ModuleNotFoundError",
    "traceback": [
        "Traceback (most recent call last):",
        "/home/user/stream_pipeline/main.py:171",
        "/home/user/stream_pipeline/main.py:11",
        "/home/user/stream_pipeline/stream_pipeline/module_classes.py:12",
        "/home/user/stream_pipeline/stream_pipeline/data_pb2_grpc.py:6",
        "ModuleNotFoundError: No module named 'data_pb2'"
    ],
    "thread": "MainThread",
    "start_context": "N/A",
    "thread_id": 140072941379584,
    "is_daemon": false
}

Fix: In stream_pipeline/data_pb2_grpc.py, line 6, change "import data_pb2 as data__pb2" to "from . import data_pb2 as data__pb2".
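Because the generated file is overwritten every time the proto files are regenerated, the fix has to be reapplied each time. The following is a minimal sketch of a helper script that does this automatically. It is not part of the repository, the names make_import_relative and patch_file are hypothetical, and it assumes the generated import has exactly the form shown above.

```python
import re
from pathlib import Path

# Matches a bare generated import such as "import data_pb2 as data__pb2".
_BARE_IMPORT = re.compile(r"^import (\w+_pb2) as (\w+)$", re.MULTILINE)


def make_import_relative(source: str) -> str:
    """Rewrite bare *_pb2 imports into package-relative imports."""
    return _BARE_IMPORT.sub(r"from . import \1 as \2", source)


def patch_file(path: Path) -> None:
    """Apply the rewrite to a generated file in place."""
    path.write_text(make_import_relative(path.read_text()))


if __name__ == "__main__":
    # Path taken from the traceback above; adjust it if your layout differs.
    generated = Path("stream_pipeline/data_pb2_grpc.py")
    if generated.exists():
        patch_file(generated)
```

The rewrite is idempotent: a line that already starts with "from . import" no longer matches the pattern, so running the script twice is harmless.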

License

MIT License
