From 1c2942986128ca6db0de3cb1ef4655df5be83f48 Mon Sep 17 00:00:00 2001 From: Victor Gaydov Date: Tue, 25 Jun 2024 15:48:26 +0400 Subject: [PATCH] [spo] draft-731: Update documentation to recent pipeline changes - describe packet read modes (fetch, peek) and status codes - describe new design of frames (pools, partial reads) - describe frame read modes (hard, soft) and codes - describe frame fields and formats --- docs/sphinx/development/coding_guidelines.rst | 72 +--- docs/sphinx/internals/audio_backends.rst | 6 +- docs/sphinx/internals/code_structure.rst | 1 + docs/sphinx/internals/glossary.rst | 6 +- docs/sphinx/internals/packets_frames.rst | 325 +++++++++++++----- docs/sphinx/internals/threads.rst | 6 +- docs/sphinx/internals/timestamps.rst | 36 +- src/internal_modules/Doxyfile | 18 +- src/internal_modules/main.dox | 6 +- 9 files changed, 315 insertions(+), 161 deletions(-) diff --git a/docs/sphinx/development/coding_guidelines.rst b/docs/sphinx/development/coding_guidelines.rst index 91301d398..06aeb999f 100644 --- a/docs/sphinx/development/coding_guidelines.rst +++ b/docs/sphinx/development/coding_guidelines.rst @@ -37,9 +37,7 @@ Portability * The code should run on a variety of operating systems, compilers, and hardware architectures, including rather old compilers and distributions. See :doc:`supported platforms ` page. -.. raw:: html - - +\ * The code specific to platform, compiler, or optional features and dependencies, should be isolated inside corresponding :ref:`target directories `. All other code should be portable across all supported configurations. @@ -48,57 +46,39 @@ Best practices * The code should compile without warnings. Use ``--enable-werror`` :doc:`option ` to turn warnings into errors. -.. raw:: html - - +\ * Cover every component with class-level unit tests if possible. Additionally, cover high-level features with pipeline-level integration tests. We use `CppUTest `_. -.. 
raw:: html - - +\ * Prefer RAII and smart pointers for resource management. -.. raw:: html - - +\ * Prefer either non-copyable or trivial-copy objects. Avoid making "heavy" operations implicit, in particular, operations involving memory management. -.. raw:: html - - +\ * Use ``const`` when it's useful. -.. raw:: html - - +\ * Use anonymous namespaces instead of static globals, functions, and constants. -.. raw:: html - - +\ * Use enums instead of defines, when possible. -.. raw:: html - - +\ * Use arenas and pools for memory management. -.. raw:: html - - +\ * Carefully log (using ``roc_log``) all important events and information needed to understand why an error occurred. -.. raw:: html - - +\ * Panic (using ``roc_panic``) when a contract or an invariant is broken. A panic is always preferred over a crash or undefined behavior. However, remember that panics are only for bugs in Roc itself. Never panic on invalid or unexpected data from the outside world. @@ -107,33 +87,23 @@ Coding style * The code should be formatted using ``scons fmt``, which invokes ``clang-format``. If it goes awry, you can prevent a file from being formatted by adding it to ``.fmtignore``. -.. raw:: html - - +\ * Header and source files should contain the "Roc Streaming authors" copyright and license header. Running ``scons fmt`` will automatically insert them. -.. raw:: html - - +\ * Headers, classes, public members, and free functions should be documented using Doxygen. Use ``--enable-doxygen`` :doc:`option ` to enable warnings about undocumented elements. -.. raw:: html - - +\ * Prefer creating individual .h and .cpp files for every class. Use snake_case for file names and old-style header guards, which are automatically inserted by ``scons fmt``. -.. raw:: html - - +\ * Use upper case SNAKE_CASE for macros, CamelCase for class names, and lower case snake_case for methods, functions, fields, and variables. Add trailing underscore\_ for private methods and fields. -.. 
raw:: html - - +\ * Members in class should have the following order: @@ -150,9 +120,7 @@ Coding style * methods * fields -.. raw:: html - - +\ * The code should be formatted according to our 1TBS-like indentation style defined in ``.clang-format`` config: @@ -161,12 +129,6 @@ Coding style * use braces even for single-statement blocks; * don't place condition or loop bodies at the same line as the control statement. -.. raw:: html - - +\ * ``#endif`` and ``#else`` statements should have trailing ``// `` and ``// !`` comments. Namespace closing brace should have trailing ``// namespace `` comment. - -.. raw:: html - - diff --git a/docs/sphinx/internals/audio_backends.rst b/docs/sphinx/internals/audio_backends.rst index 9d2d77dc9..4a359aa12 100644 --- a/docs/sphinx/internals/audio_backends.rst +++ b/docs/sphinx/internals/audio_backends.rst @@ -20,7 +20,7 @@ syntax meaning example ``file://-`` or ``file:-`` stdin or stdout ``file:-`` ========================== ========================== ============== -User can specify input file/device (**source**) for ``roc-send`` via ``--input`` option, and output file/device (**sink**) for ``roec-recv`` via ``--output`` option. +User can specify input file/device (**source**) for ``roc-send`` via ``--input`` option, and output file/device (**sink**) for ``roc-recv`` via ``--output`` option. When device is used, user specifies driver explicitly (e.g. ``alsa://`` for ALSA, ``pulse://`` for PulseAudio, etc). When file is used, file driver is selected automatically, usually by file extension. However, user may force usage of specific driver for the file via ``--input-format`` or ``--output-format`` option. 
@@ -45,7 +45,7 @@ The job of ``roc-send`` and ``roc-recv`` is thus to open a source and a sink and - in ``roc-recv``, ``ISource`` is implemented by receiver pipeline from ``roc_pipeline``, and ``ISink`` is implemented by device or file from ``roc_sndio`` -The task of transferring sound from ``ISource`` to ``ISink`` is implemented in `sndio::Pump `_ class, which works uniformely with any pair of source and sink, being it file, device, or pipeline. +The task of transferring sound from ``ISource`` to ``ISink`` is implemented in `sndio::Pump `_ class, which works uniformly with any pair of source and sink, being it file, device, or pipeline. Backends and drivers ==================== @@ -56,7 +56,7 @@ Every **backend** (`IBackend `_ holds s shared pointer to `core::Buffer `_ plus offset and length of the region inside the buffer. +* **data** - Byte slice (``core::Slice``) that references binary data of the whole packet. `core::Slice `_ holds a shared pointer to `core::Buffer `_ or `core::BufferView `_ plus offset and length of the region inside the buffer. * **headers** - Parsed protocol-specific fields. For example, `UDP `_ struct holds UDP ports, and `RTP `_ struct holds RTP timestamp, seqnum, and other fields. What headers are present in packet depends on packet type. * **flags** - Bitmask that defines what kind of packet is this, what headers are available, and what action were already done with the packet. -Packet lifecycle -================ +Packet headers typically have some kind of sequence number and/or timestamp. Packets may be lost, duplicated, or reordered. Components that work with packets use sequence numbers and timestamps to determine position of each packet in the stream. + +Packet payload may vary significantly depending on the protocol. It may be uncompressed audio, encoded audio chunks, redundancy data used for loss repair, structured control information, etc. 
+ +Packet types +============ + +There are three major types of packets: + +* **source packets** -- Packets with encoded media data. + + Always present. When FEC is enabled, may also contain FEC-specific header or footer. + +* **repair packets** -- Packets with encoded redundancy data. + + Present only when FEC is enabled. Format depends on FEC scheme. Used to restore lost source (media) packets on receiver. + +* **control packets** -- Out-of-band control messages. + + Present unless disabled by user. May be used for session management, congestion control, latency estimation, synchronization, etc. + +Typical examples are RTP for source packets, FECFRAME for repair packets, and RTCP for control packets. See :doc:`/internals/network_protocols` page for the list of supported protocols. + +For further details about FECFRAME and RTCP usage, see :doc:`/internals/fec` and :doc:`/internals/timestamps`. + +Packet life cycle +================= Packet life cycle depends on whether we're inside sender or receiver pipeline. -Typical packet **lifecycle on sender**: +Typical packet **life cycle on sender**: -* Allocate packet from packet pool (abstracted by `packet factory `_). +* Allocate packet and packet buffer from pools (abstracted by `packet factory `_). Attach buffer to packet. -.. raw:: html +\ - +* If there are specific requirements for payload alignment, ask `packet composer `_ to **align buffer**. Composer adjusts packet buffer in a way so that payload inside buffer would have desired alignment. -* Allocate buffer from buffer pool (abstracted by `buffer factory `_) and attach buffer to packet. +\ -.. raw:: html +* Ask `packet composer `_ to **prepare packet**. Composer resizes packet buffer to be able to hold given payload size and all necessary headers. Composer also enables appropriate header structs in packet (e.g. ``RTP`` or ``FEC``) by setting appropriate packet flags. 
- +\ -* If there are specific requirements for payload alignment, ask `packet composer `_ to **align** packet. Composer adjusts packet's buffer in a way so that payload inside buffer would have desired alignment. +* Pass packet to pipeline of chained `packet writers `_. As packet goes through the pipeline, pipeline components may incrementally **populate packet** with data, i.e. set fields of packet's header structs and encode samples into packet buffer. -.. raw:: html +\ - +* In the end of pipeline, ask `packet composer `_ to **compose packet**. Composer finishes filling of packet buffer by encoding all fields from header structs into corresponding parts of the packet buffer. -* Ask `packet composer `_ to **prepare** packet. Composer resizes packet's buffer to be able to hold given payload size and all necessary headers. Composer also enables appropriate header structs in packet (e.g. ``RTP`` or ``FEC``) by setting appropriate packet flags. +\ -.. raw:: html +* After the packet is composed, **send packet** over network. - +\ -* Pass packet to pipeline of chained `packet writers `_. As packet goes through the pipeline, pipeline components may populate packet with more data, i.e. set fields of packet's header structs and encode samples directly into packet's buffer. +* After packet is sent, return packet and packet buffer to their pools. -.. raw:: html +Typical packet **life cycle on receiver**: - +* Allocate packet and packet buffer from pools (abstracted by `packet factory `_). Attach buffer to packet. -* In the end of pipeline, ask `packet composer `_ to **compose** packet. Composer finishes filling of packet's buffer by encoding all fields from header structs into corresponding parts of the packet's buffer. +\ -.. raw:: html +* **Fill buffer** with data retrieved from network. - +\ -* After the packet is composed, its buffer may be sent over network. +* Ask `packet parser `_ to **parse packet**. Parser enables appropriate header structs in packet (e.g. 
``RTP`` or ``FEC``) by setting appropriate packet flags, and fills these structs with information decoded from packet buffer. -.. raw:: html +\ - +* Pass packet to pipeline of chained `packet readers `_. As packet goes through the pipeline, pipeline components may **process packet** and read parsed header fields or decode samples from packet buffer. -* After packet is sent, packet and packet's buffer may be returned to pools. +\ -Typical packet **lifecycle on receiver**: +* After packet is not needed anymore, return packet and packet buffer to their pools. -* Allocate packet from packet pool (abstracted by `packet factory `_). +For further details, see :doc:`/internals/pipelines`. -.. raw:: html +Packet ownership +================ - +Packet and packet buffer are both reference-countable objects. Packet factories, writers, and readers pass packets using shared pointers. Packet itself holds a shared pointer to its buffer. Readers and writers follow simple rules: -* Allocate buffer from buffer pool (abstracted by `buffer factory `_) and attach buffer to packet. +* When packet is written to packet writer, the **right to modify** packet or packet buffer is **passed from caller to writer**. The caller may retain a reference to the packet if needed, but should assume that writer may modify packet immediately or later. -.. raw:: html +\ - +* When packet is fetched (``ModeFetch``) or peeked (``ModePeek``) from packet reader, the **right to modify** packet or packet buffer is **passed from reader to caller**. Reader may retain a reference to the packet if needed, but should assume that the caller may modify packet. -* Fill packet's buffer with data retrieved from network. +Packet read mode (fetch vs peek) +================================ -.. 
raw:: html +All `packet readers `_ support two reading modes: - +* ``ModeFetch`` -- get next available packet, remove it from queue, and return it +* ``ModePeek`` -- try to return next available packet, but don't remove it from queue -* Ask `packet parser `_ to **parse** packet's buffer. Parser enables appropriate header structs in packet (e.g. ``RTP`` or ``FEC``) by setting appropriate packet flags, and fills these structs with information decoded from packet's buffer. +``ModeFetch`` is the "normal" mode, used when we need to get next packet and move stream forward. -.. raw:: html +``ModePeek`` implements a kind of a look-ahead. It is used when we want to inspect next packet before deciding whether to fetch it from reader. Fetching is an irreversible action, as it moves the read pointer forward, and sometimes we may want to avoid it depending on what packet is next. - +Here is an example when ``ModePeek`` is useful: -* Pass packet to pipeline of chained `packet readers `_. As packet goes through the pipeline, pipeline components may read fields of packet's header structs and decode samples directly from packet's buffer. +* Imagine we're reading packets from FEC reader and there are 10 packets per FEC block. We've read 7th packet in current block, and now it's time to play 8th packet. But, packets 8, 9, 10 were delayed by network and weren't repaired, and 1st packet of the next block already arrived. -.. raw:: html +\ - +* If we perform a regular fetch now (``ModeFetch``), FEC reader would move pointer to next available packet, i.e. 1st packet of the next block. After switching to next block, it loses the possibility to repair 9th and 10th packets from previous block even if more packets arrive by the time they're needed. -* After packet is not needed anymore, packet and packet's buffer may be returned to pools. +\ -For further details, see :doc:`/internals/pipelines`. 
+* In contrast, if we perform a peek (``ModePeek``) and see that the next available packet is not needed right now, we can skip fetch until next read. We still have to insert gap in place of 8th packet, as it's already time to play it. However, since we haven't switched to the next block, we still have a chance that 9th and 10th packets will arrive or be repaired by the time when we need to play them. -Packet parsers and composers -============================ +It's not guaranteed that ``ModePeek`` always can see next packet. Depending on implementation and current state, packet reader may not be able to access next packet without moving stream position forward. In such cases, ``ModeFetch`` would return a packet, but ``ModePeek`` returns ``StatusDrain``. -Packet `parser `_ and `composer `_ are interfaces that have implementations for various protocols, e.g. RTP or FECFRAME. +Packet status codes +=================== -Both parsers and composers can be **chained** to implement stacking of protocols. For example, depending on FEC scheme, FECFRAME may require adding a footer to source packets. When such FEC scheme is used, pipeline will create two chained parsers/composers: the first one for FECFRAME protocol, and the second, nested one, for RTP protocol. +Packet read and write operations return `status codes `_: -The chaining support is based on `slices `_. Packet's data field contains a slice that refers to a part of a buffer. When chaining is employed, the upper parser/composer creates a sub-slice of packet's buffer which corresponds to the nested protocol, and passes that sub-slice to the nested parser/composer. This way parser or composer does not need to be aware of whether it's the upper one or nested one. +* ``StatusOK`` -Slices are also used in composer for payload alignment. Some pipeline components may have specific requirements for payload, for example, OpenFEC codec requires payload to be 8-byte aligned. 
To achieve this, FEC composer may sub-slice initial packet's buffer to shift its beginning in a way that after adding all headers, payload becomes properly aligned. + Packet was successfully read or written. -Packet types -============ +* ``StatusDrain`` -There are three major types of packets: + Packet reader returns it when there are no packets to read right now (but more can arrive later). When peek mode is used (``ModePeek``), packet reader may also return it when look-ahead is not possible without moving stream position forward (but it may be possible to read packet using fetch mode). -* **source packets** -- Packets with encoded media data. + Packet writer never returns this status. - Always present. When FEC is enabled, may also contain FEC-specific header or footer. +* *other code* -* **repair packets** -- Packets with encoded redundancy data. + Any other status indicates pipeline failure and typically causes session termination. - Present only when FEC is enabled. Format depends on FEC scheme. Used to restore lost source (media) packets on receiver. +.. note:: -* **control packets** -- Out-of-band control messages. + Packet readers and writers never return ``StatusPart``, as it's not possible to read or write a part of a packet. - Present unless disabled by user. May be used for session management, congestion control, latency estimation, synchronization, etc. +Packet parsers and composers +============================ -Typical examples are RTP for source packets, FECFRAME for repair packets, and RTCP for control packets. See :doc:`/internals/network_protocols` page for the list of supported protocols. +Packet `parser `_ and `composer `_ are interfaces that have implementations for various protocols, e.g. RTP or FECFRAME. -For further details about FECFRAME and RTCP usage, see :doc:`/internals/fec` and :doc:`/internals/timestamps`. +Both parsers and composers can be chained to implement stacking of protocols. 
For example, depending on FEC scheme, FECFRAME may require adding a footer to source packets. When such FEC scheme is used, pipeline will create two chained parsers/composers: the first one for FECFRAME protocol, and the second, nested one, for RTP protocol. + +The chaining support is based on `slices `_. Packet's data field contains a slice that refers to a part of a buffer. When chaining is employed, the upper parser/composer creates a sub-slice of packet buffer which corresponds to the nested protocol, and passes that sub-slice to the nested parser/composer. This way parser or composer does not need to be aware of whether it's the upper one or nested one. + +Slices are also used in composer for payload alignment. Some pipeline components may have specific requirements for payload, for example, OpenFEC codec requires payload to be 8-byte aligned. To achieve this, FEC composer may sub-slice initial packet buffer to shift its beginning in a way that after adding all headers, payload becomes properly aligned. Frames ====== `Frame `_ class from ``roc_audio`` module represents input or output audio frame. -* **samples** - Pointer to audio samples array and its length. +Frame holds the following information: -* **flags** - Bitmask that defines what additional information about the frame. +* **buffer** - Byte slice (``core::Slice``) that references binary data of the frame. `core::Slice `_ holds a shared pointer to `core::Buffer `_ or `core::BufferView `_ plus offset and length of the region inside the buffer. -Unlike packet, a frame does not hold ownership of the samples array (packet holds a shared pointer to buffer). Frames are typically short-living objects allocated on stack and existing only during single pipeline tick. +* **format** - Encoding identifier for the buffer contents. -Frame lifecycle -=============== +* **flags** - Bitmask that defines additional characteristics of the frame, e.g. 
does it have samples decoded from packets or interpolated because of a packet loss. + +* **duration** - How many samples the frame contains (per audio channel). + +* **capture timestamp** - Absolute time when the first sample of the frame was captured on sender (see :doc:`/internals/timestamps`). + +Unlike packets, which may be lost or reordered, frames are always arranged into a **continuous stream**. Next frame always holds samples that are following immediately after the previous frame. If corresponding packet was lost, it is replaced with zeroized or interpolated samples, to keep stream continuous. + +Samples in frame's buffer may have different encoding, depending on the format field of the frame. At different stages of the sender or receiver pipeline, frames may have different formats. + +Frame formats +============= + +There are three important categories of frames: + +* **raw frames** - frame uses so-called "raw" format + + Raw format is a native-endian uncompressed PCM with 32-bit floats. Many pipeline elements can work only with frames in raw format (e.g. resampler). If network or sound card uses different format, a conversion is performed in the beginning or in the end of the pipeline. + +* **pcm frames** - frame uses any PCM format + + Such frames still use PCM, but sample size and endianness may be arbitrary (e.g. 24-bit big-endian unsigned integers). Some pipeline elements can work with arbitrary PCM frames, e.g. entry point in the beginning or in the end of the pipeline. For example, when a sound card may produce frames in non-raw PCM format, we still can do some simple operations on it, e.g. split frames into several sub-frames. + +* **opaque frames** + + All non-PCM frames are considered opaque. We can't do much with such frames, except doing a verbatim copy or passing to decoder. + +Frame life cycle +================ Frame life cycle depends on whether we're inside sender or receiver pipeline. 
-Typical frame **lifecycle on sender**: +Typical frame **life cycle on sender**: -* Allocate frame on stack and attach to some samples array. +* Allocate frame and frame buffer from pools (abstracted by `frame factory `_). Attach buffer to frame. -.. raw:: html +\ - +* **Fill buffer** with the samples from user or sound card. -* Pass frame to pipeline of chained `frame writers `_, requesting to process samples from this frame. A writer may pass the same frame further, or may create a new frame based on the provided one. +\ -.. raw:: html +* Pass frame to pipeline of chained `frame writers `_, requesting to **write samples** from this frame. A writer may pass the same frame further, or may create a new frame based on the provided one. - +\ -* Eventually, `packetizer `_ produces packet(s) based on the frame, and the remainder of the pipeline will pass packets. +* Eventually, frame reaches `packetizer `_, which **produces packets** based on the frame, and the remainder of the pipeline will pass packets. -Typical frame **lifecycle on receiver**: +Typical frame **life cycle on receiver**: -* Allocate frame on stack and attach to some samples array. +* Allocate frame from pool (abstracted by `frame factory `_). -.. raw:: html +\ - +* Optionally, allocate frame buffer and attach to the frame. If frame has pre-allocated buffer, frame reader is allowed, but not required, to use this buffer to write samples. If there is no pre-allocated buffer, or frame reader doesn't want to use it, it must allocate buffer by itself and attach it to frame. -* Pass frame to pipeline of chained `frame reader `_, requesting to fill this frame with samples. A reader may pass the same frame further, or may create a new frame, request subsequent reader(s) to fill it, and then fill the provided frame based on that. +\ -.. raw:: html +* Pass frame to pipeline of chained `frame reader `_, requesting to **read samples** into the frame. 
A reader may pass the same frame further, or may create a new frame, request subsequent reader(s) to fill it, and then fill the provided frame based on that. - +\ -* Eventually, the request reaches `depacketizer `_, which consumes packet(s) from incoming queue and decodes them into the frame. +* Eventually, the request reaches `depacketizer `_, which **consumes packets** from incoming queue and decodes them into the frame. For further details, see :doc:`/internals/pipelines`. + +Frame ownership +=============== + +Frame and frame buffer are both reference-countable objects. Similar to packet, frame holds a shared pointer to its buffer. However, ownership rules for frames are different. + +* When frame is written to frame writer, the **caller retains right to modify frame and buffer**. Frame writer is not allowed to modify frame or its buffer, and should assume that caller may modify them after the writer returns. + +\ + +* When frame is requested from frame reader, again, the **caller retains right to modify frame and buffer**. Frame is allocated by caller and passed to frame reader, which should fill it with the result. Frame buffer may be allocated either by caller or by frame reader. In all cases, the caller owns both frame and buffer, and frame reader should assume that caller may modify them after the reader returns. + +As mentioned above, when reading frame, the buffer may be either pre-allocated by caller, or allocated and returned by frame reader. It gives flexibility in memory management and allows to choose most efficient implementation depending on situation: + +* In some cases, it is beneficial to pre-allocate frame and frame buffer and reuse them each time when we call frame reader. Due to pre-allocated buffer, read operation wouldn't require allocations. + +\ + +* In other cases, it is more beneficial to allow frame reader to allocate buffer by itself. 
This is useful when frame reader already has its own buffer, and instead of copying data from it to caller's buffer, it can just attach (a slice of) existing buffer to the frame. + +`Frame factory `_ provides convenient method ``reallocate_frame()`` that implements the first approach, suitable for most frame readers. It checks if the frame already has pre-allocated buffer large enough to fit the result. If not, it automatically allocates a new buffer and attaches it to the frame. After this call, frame is guaranteed to have a suitable buffer, no matter if it was pre-allocated or not. + +Frame read mode (hard vs soft) +============================== + +All `frame readers `_ support two reading modes: + +* ``ModeHard`` -- read as many samples as possible, fill gaps caused by packet losses with zeros or interpolation +* ``ModeSoft`` -- read until next gap, but no further + +``ModeHard`` is the "normal" mode, which is used to read frames when it's time to play them. + +``ModeSoft`` is a mechanism for prefetching frames that are not needed right now, but will be needed soon. If next packets already arrived, it works the same as ``ModeHard``. However, if next packets are missing, it stops reading and doesn't move the stream forward. + +Prefetching helps to counter occasional processing, scheduling, and I/O jitter without increasing latency: + +* If we have time until next frame, we can perform a soft read to try decoding next frame in advance, to reduce probability of an underrun. + +\ + +* If soft read encounters a gap caused by packet loss, it stops. Hard read would instead fill the gap with zeros or interpolation. We don't want it because these packets still have a chance to arrive by the time when they should be played, and we shouldn't consider them lost until then. + +When ``ModeSoft`` stops early, it returns either ``StatusPart`` (if it has read some samples before the gap), or ``StatusDrain`` (if it hasn't read any samples at all). 
Note that ``StatusPart`` may be also caused by other reasons (see below). + +Frame partial reads +=================== + +To read a frame, the caller provides `frame reader `_ with a frame and requested duration. Frame reader is allowed to return smaller duration than requested. This is called "partial read" and is indicated by ``StatusPart`` code. + +Partial reads are used widely for various purposes: + +* To fit the result into the maximum buffer size that can be allocated from frame buffer pool. + +\ + +* To truncate the result by some internal chunk or packet boundary. This allows to simplify implementations of many readers, as they don't need to implement a loop that concatenates internal chunks into a single frame. + +\ + +* To separate signal (decoded from packets) and gaps (caused by packet losses) into separate frames. This simplifies implementation of PLC (packet loss concealment), as it always can either forward or interpolate the whole frame. + +\ + +* To implement soft reads (see above), by returning only data until next packet loss. + +When the caller gets ``StatusPart`` (no matter if it used ``ModeHard`` or ``ModeSoft``), it is supposed to repeat the call in a loop until it gets ``StatusOK`` (frame is fully read) or ``StatusDrain`` (read stopped early, may happen only with ``ModeSoft``). + +For simplicity, most pipeline elements just forward whatever status and frame they got and rely on upper levels to repeat the call if needed. The described loop is implemented in `mixer `_, which is the entry point to the readers pipeline. + +Frame status codes +================== + +Frame read and write operations return `status codes `_: + +* ``StatusOK`` + + Frame was successfully and fully read or written. + +* ``StatusPart`` + + Frame reader returns it when the frame was only partially read and has smaller duration than requested. This may happen during a soft read, or due to limitations or simplifications in implementation. 
For example, reader is allowed to truncate frame to fit maximum buffer size or chunk boundary. + + Frame writer never returns this status. + +* ``StatusDrain`` + + Frame reader returns it when there are no samples to read. This happens only for soft read when next packet is missing. It can't happen for hard read because missing packets are replaced with zeros or interpolation. + + Frame writer never returns this status. + +* *other code* + + Any other status indicates pipeline failure and typically causes session termination. + +Frame encoders and decoders +=========================== + +Frame `encoder `_ and `decoder `_ are interfaces that have implementations for various codecs, e.g. PCM or FLAC. + +Frame encoder is used on sender in `packetizer `_ to encode raw frame into opaque packet payload. Frame decoder is used on receiver in `depacketizer `_ to decode packet payload into raw frame. + +Some codecs may implement extra features besides encoding and decoding, e.g. Opus codec is capable of restoring lost packets with a reduced quality, if the packet following the lost one is available. diff --git a/docs/sphinx/internals/threads.rst b/docs/sphinx/internals/threads.rst index cf427ed88..131481b68 100644 --- a/docs/sphinx/internals/threads.rst +++ b/docs/sphinx/internals/threads.rst @@ -8,7 +8,7 @@ Threads and queues Threads ======= -Roc nodes (senders and receivers) typically employ several threads. +Roc senders and receivers typically employ several threads. * **Network I/O thread** @@ -34,7 +34,7 @@ Roc nodes (senders and receivers) typically employ several threads. Implemented by `ControlLoop `_ class from ``roc_ctl`` module. -Depending on sound system in use, sound I/O thread and pipeline thread may be the same thread. For example, on ALSA a single thread perform audio I/O and processing, and on PulseAudio, there are separate threads for I/O and processing. +Depending on sound system in use, sound I/O thread and pipeline thread may be the same thread. 
For example, on ALSA a single thread performs audio I/O and processing, and on PulseAudio, there are separate threads for I/O and processing. When the user uses ``roc_sender`` or ``roc_receiver`` from the :doc:`C library `, Roc does not manage sound I/O. It also does not create dedicated pipeline thread - instead, the user invokes pipeline processing on their own thread. @@ -43,7 +43,7 @@ Network and control threads belong to context. Sound I/O and pipeline threads, i Queues ====== -Threads in Roc typically don't have a lot of shared state. They are very isolated and communicate only via packet and task queues. With this approach, most components do not have to bother with synchronization. +Threads in Roc typically don't have a lot of shared state. They are very isolated and communicate only via packet, frame, or task queues. With this approach, most components do not have to bother with synchronization. The queues between threads are usually lock-free and on some platforms also wait-free, which helps to avoid priority inversion problems (when real-time or high-priority thread is blocked or delayed by low-priority threads). diff --git a/docs/sphinx/internals/timestamps.rst b/docs/sphinx/internals/timestamps.rst index a999da911..aa28c1289 100644 --- a/docs/sphinx/internals/timestamps.rst +++ b/docs/sphinx/internals/timestamps.rst @@ -15,7 +15,7 @@ Types of timestamps * RTS - receive timestamp * QTS - queue timestamp -**Stream timestamp** (STS) describes position of the first sample in packet or frame using abstract stream clock. +**Stream timestamp (STS)** describes position of the first sample in packet or frame using abstract stream clock. This clock corresponds to the sample source on sender and has sample rate of the stream. For example, if sender is 44100 Hz 2-channel stereo audio card, then STS is incremented by one each of two generated samples (left and right), and it happens 44100 times per second, according to the audio card clock. 
@@ -23,22 +23,26 @@ Note that this clock is in a different domain compared to sample sink on receive STS directly corresponds to the "timestamp" 32-bit field in RTP packets. STS starts from a random value (as required by RTP) and may periodically wrap. -**Capture timestamp** (CTS) describes the same event as STS, i.e. originating of the first sample in packet or frame, but using local Unix-time UTC clock, counting nanoseconds since Unix Epoch. +**Capture timestamp (CTS)** describes the same event as STS, i.e. originating of the first sample in packet or frame, but using local Unix-time UTC clock, counting nanoseconds since Unix Epoch. The clock for CTS always belongs to the local system, no matter if we're on sender or receiver: * On sender, CTS of packet or frame is set to the system time when its first sample was captured. -* On receiver, CTS is set to an estimation of the same value, converted to receiver system clock, i.e. the system time *of receiver* when the first sample was captured *on sender*. +* On receiver, CTS is set to an estimation of the same value, converted to receiver system clock, i.e. the system time *on receiver* when the first sample was captured *on sender*. Unlike STS, this field does not directly correspond to any field inside RTP packet. Instead, sender and receiver exchange RTCP packets which help them to map STS to CTS, as described in the further sections. -**Receive timestamp** (RTS) is the time when the packet reached incoming network queue. +**Receive timestamp (RTS)** is the time when the packet reached incoming network queue on receiver. The clock for RTS is the same as for CTS: local Unix-time UTC clock, counting nanoseconds since Unix Epoch. This timestamp is used only on receiver and only for packets. -**Queue timestamp** (QTS) is the time when the packet was transferred to a local queue of a sink-thread. The main difference with RTS is thread-switch time. 
+**Queue timestamp (QTS)** is the time when the packet reached pipeline queue on receiver. + +The clock for QTS is the same as for CTS: local Unix-time UTC clock, counting nanoseconds since Unix Epoch. + +The difference between RTS and QTS is the time the packet spends in incoming queue until pipeline thread fetches it. QTS allows us to account for additional jitter introduced by processing and thread-switch time. This timestamp is used only on receiver and only for packets. @@ -49,11 +53,25 @@ Stream timestamps are used to position packets in the continuous stream of sampl Capture and receive timestamps have the following usages: -* Estimate end-to-end latency between sender and receiver. To compute it, receiver needs to find the difference between the time when the frame was captured (i.e. capture timestamp) and the time when the frame is actually played (which receiver knows). +* STS and CTS: + + * Estimate end-to-end latency between sender and receiver. + + To compute it, receiver needs to find the difference between the time when the frame was captured (i.e. capture timestamp) and the time when the frame is actually played (which receiver knows). + + * Maintain fixed end-to-end latency. + + If we have end-to-end latency metric on receiver, we can use it to drive latency tuning engine. Unlike NIQ latency (network queue size), end-to-end latency is very stable and allows more precise tuning and lower latency values. + + * Synchronize playback of receivers. + + For synchronous playback, it is enough to configure all receivers to maintain the same end-to-end latency. Since all of them will derive CTS from the same source (sender), playback will be automatically synchronous. + +* RTS and QTS: -* Maintain fixed end-to-end latency. If we have end-to-end latency metric on receiver, we can use it to drive clock synchronization engine. + * Estimate network jitter (using RTS) or network + processing + thread-switch jitter (using QTS). -* Synchronize playback of receivers.
For synchronous playback, all receivers should be configured to maintain the same end-to-end latency. + Receiver monitors these timestamps to determine jitter of the arriving packets. Then it may instruct latency tuner to keep latency above the jitter to prevent glitches caused by jitter. Timestamp mapping ================= @@ -68,7 +86,7 @@ This is how it works: * Receiver maintains the same mapping, and updates this mapping whenever it receives an RTCP packet. Using this mapping, receiver is able to assign each packet a CTS based on its STS field. -This logic is implemented in ``TimestampExtractor`` (on sender) and ``TimestampInjector`` (on receiver). +This logic is implemented in `TimestampExtractor `_ (on sender) and `TimestampInjector `_ (on receiver). There are two subtleties here: diff --git a/src/internal_modules/Doxyfile b/src/internal_modules/Doxyfile index fc9b10b13..a56ed46ac 100644 --- a/src/internal_modules/Doxyfile +++ b/src/internal_modules/Doxyfile @@ -7,11 +7,15 @@ RECURSIVE = YES INPUT = . STRIP_FROM_PATH = . +STRIP_FROM_INC_PATH = . 
FILE_PATTERNS = *.h *.dox -EXCLUDE = \ +EXCLUDE = \ roc_core/target_posix/roc_core/cpu_traits.h +EXCLUDE_SYMBOLS = \ + AllocationPolicy + FULL_PATH_NAMES = YES SOURCE_BROWSER = YES @@ -25,8 +29,8 @@ REPEAT_BRIEF = YES HIDE_UNDOC_CLASSES = NO HIDE_UNDOC_MEMBERS = NO +HIDE_UNDOC_RELATIONS = NO STRIP_CODE_COMMENTS = NO - EXTRACT_ALL = NO EXTRACT_PRIVATE = NO @@ -40,6 +44,7 @@ EXPAND_ONLY_PREDEF = YES PREDEFINED = \ ROC_ATTR_NORETURN= \ + ROC_ATTR_NODISCARD= \ ROC_ATTR_PRINTF(x,y)= \ ROC_ATTR_PACKED_BEGIN= \ ROC_ATTR_PACKED_END= @@ -47,10 +52,17 @@ PREDEFINED = \ OUTPUT_DIRECTORY = ../../build/docs/internal_modules HTML_OUTPUT = ../../../docs/html/doxygen -GENERATE_LATEX = NO +CLASS_GRAPH = YES +COLLABORATION_GRAPH = NO +HAVE_DOT = YES DOT_GRAPH_MAX_NODES = 1000 +GENERATE_LATEX = NO + HTML_DYNAMIC_SECTIONS = NO +HTML_COLORSTYLE_HUE = 215 +HTML_COLORSTYLE_SAT = 80 +HTML_COLORSTYLE_GAMMA = 80 DISABLE_INDEX = NO GENERATE_TREEVIEW = YES SEARCHENGINE = YES diff --git a/src/internal_modules/main.dox b/src/internal_modules/main.dox index 2083a2602..2f2048085 100644 --- a/src/internal_modules/main.dox +++ b/src/internal_modules/main.dox @@ -14,9 +14,9 @@ namespace roc { } namespace roc { - //! @namespace roc::error - //! Error codes. - namepsace error {} + //! @namespace roc::status + //! Status codes. + namespace status {} } namespace roc {