Add H264RtpDepacketizer #1082

Sean-Der · 2023-12-30T03:36:24Z

Inverse of H264RtpPacketizer. Takes incoming H264 packets and emits H264 NALUs.

Sean-Der · 2023-12-30T03:37:11Z

@paullouisageneau can/should I add more fields to Message? It would be nice to know Duration + Discontinuity

paullouisageneau

> @paullouisageneau can/should I add more fields to Message? It would be nice to know Duration + Discontinuity

Yes, feel free to add them, but not directly in Message as it adds overhead to every packet in every transport. A shared_ptr<FrameInfo> would make sense, like for the reliability information. You should also add the frame timestamp there.
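A rough sketch of what the suggestion above could look like, assuming a simplified stand-in for `Message`. The `FrameInfo` field names, the `Message` layout, and the `tagFrame` helper are all hypothetical and only illustrate the idea of keeping per-frame metadata behind an optional `shared_ptr` so that messages without it pay only for an empty pointer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical per-frame metadata; the field names are assumptions.
struct FrameInfo {
	uint32_t timestamp = 0;                // RTP timestamp of the frame
	std::chrono::milliseconds duration{0}; // frame duration, if known
	bool discontinuity = false;            // set when data before this frame was lost
};

// Simplified stand-in for a message: payload plus optional shared metadata.
struct Message {
	std::vector<std::byte> payload;
	std::shared_ptr<FrameInfo> frameInfo; // stays null when there is no frame metadata
};

// Attaches the same FrameInfo instance to every message of one frame.
inline void tagFrame(std::vector<Message> &messages, std::shared_ptr<FrameInfo> info) {
	assert(info != nullptr);
	for (auto &m : messages)
		m.frameInfo = info; // shared ownership, the metadata itself is not copied
}
```

With this shape, adding another metadata field later only changes `FrameInfo`, not the size of every message.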

src/h264rtpdepacketizer.cpp

paullouisageneau · 2024-01-07T19:01:58Z

+		auto first = this->rtp_buffer.begin();
+		auto last = this->rtp_buffer.begin() + (packets_in_timestamp - 1);
+
+		messages = buildFrame(first, last);


I think there are issues with the handling of messages. For instance:

If there is a single timestamp in rtp_buffer (for instance a single frame), no new frame is depacketized (because of the break just above). In that case it looks like the input messages in messages won't be cleared and will leak to the next element in the media processing chain.

If there are two frames depacketized in a single call, messages will be replaced for each frame, so frames will be dropped and only the last one will be passed to the chain.

Maybe I'm missing parts of the logic but if the principle is to flush the current frame when the next timestamp is seen, couldn't such a simple approach do the job for H264RtpDepacketizer::incoming?

    message_vector result;
    for (auto message : messages) {
        [...] // check message type and size
        auto p = reinterpret_cast<const RtpHeader *>(message->data());
        if (!rtp_buffer.empty() && current_timestamp != p->timestamp()) {
            result.push_back(buildFrame(rtp_buffer.begin(), rtp_buffer.end()));
            rtp_buffer.clear();
        }
        current_timestamp = p->timestamp();
        rtp_buffer.push_back(std::move(message));
    }
    messages.swap(result);

current_timestamp could be a class member (or read from a packet in rtp_buffer before the loop).
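The flush-on-next-timestamp idea described above can be tried out in isolation. The following is a minimal self-contained sketch, assuming a simplified `Packet` type in place of real RTP messages; `Packet`, `Frame`, and `TimestampGrouper` are illustrative names, not part of the library. It returns zero frames when no timestamp change is seen and several frames when one call completes more than one.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for an RTP packet: only the fields this sketch needs.
struct Packet {
	uint32_t timestamp;
	int payload;
};

using Frame = std::vector<Packet>;

// Hypothetical helper class; names are illustrative only.
class TimestampGrouper {
public:
	// Consumes packets and returns every frame completed by this call.
	// A frame counts as complete once a packet with a different timestamp arrives.
	std::vector<Frame> incoming(std::vector<Packet> packets) {
		std::vector<Frame> result;
		for (auto &p : packets) {
			if (!buffer_.empty() && buffer_.front().timestamp != p.timestamp) {
				result.push_back(std::move(buffer_));
				buffer_.clear(); // reset the moved-from vector to a known empty state
			}
			buffer_.push_back(std::move(p));
		}
		// Invariant: all buffered packets share one timestamp.
		assert(buffer_.empty() || buffer_.front().timestamp == buffer_.back().timestamp);
		return result;
	}

private:
	Frame buffer_; // packets of the frame still being assembled
};
```

Note that the last frame stays buffered until a packet with a newer timestamp shows up, which is inherent to this approach.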

Both should be handled now!

If incoming RTP packets aren't enough to build a frame

messages.clear() is called so messages aren't leaked

Many frames in a single call

I merge the lists now. If an incoming RTP packet results in multiple frames being available, it works!

It looks correct now, even if I'm still a bit puzzled by the convoluted approach.

paullouisageneau · 2024-01-07T19:40:20Z

Thanks for adding the depacketizer, this is great!

For visibility, this PR partially implements #676.

walletiger · 2024-01-20T13:56:22Z

How do I use this depacketizer in the C API?

Sean-Der · 2024-02-16T17:54:06Z

@paullouisageneau can I get another review please! Sorry for the delay, I will be on top of this now :)

src/h264rtpdepacketizer.cpp

paullouisageneau · 2024-02-17T15:08:12Z

+		auto firstByte = std::to_integer<uint8_t>(pkt->at(headerSize));
+		auto secondByte = std::to_integer<uint8_t>(pkt->at(headerSize + 1));
+		auto naluType = firstByte & naluTypeBitmask;


Is there a reason for redefining the parsing logic and constants rather than relying on the helper structs in nalunit.hpp?

@paullouisageneau Wasn't aware of it! I just looked a bit and I don't believe they are applicable.

nalunit.hpp seems to be just concerned with detecting/splitting NAL units and not the actual understanding of them?

I am all for expanding nalunit.hpp to include this logic too, though, if you want that in this commit.

Good point. No need to expand the logic, but maybe you could only use the header struct from nalunit.hpp to read the fields here?
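For readers unfamiliar with the fields being discussed, here is a small sketch of reading them through header structs instead of ad hoc masks. The bit layouts follow RFC 6184 (NAL unit header: F, NRI, Type; FU header: S, E, R, Type). The struct and function names are hypothetical and are not the ones defined in nalunit.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view over the one-byte H.264 NAL unit header (RFC 6184):
// F (1 bit) | NRI (2 bits) | Type (5 bits)
struct NalHeaderView {
	uint8_t byte;
	bool forbiddenBit() const { return (byte >> 7) & 0x01; }
	uint8_t nri() const { return (byte >> 5) & 0x03; }
	uint8_t unitType() const { return byte & 0x1F; }
};

// Hypothetical view over the FU header that follows the FU indicator in FU-A:
// S (1 bit) | E (1 bit) | R (1 bit) | Type (5 bits)
struct FuHeaderView {
	uint8_t byte;
	bool isStart() const { return (byte >> 7) & 0x01; }
	bool isEnd() const { return (byte >> 6) & 0x01; }
	uint8_t unitType() const { return byte & 0x1F; }
};

constexpr uint8_t kNaluTypeStapA = 24; // single-time aggregation packet
constexpr uint8_t kNaluTypeFuA = 28;   // fragmentation unit

// Rebuilds the original NAL header of a fragmented unit: F and NRI come from
// the FU indicator, the type comes from the FU header.
inline uint8_t reconstructNalHeader(NalHeaderView indicator, FuHeaderView fu) {
	assert(indicator.unitType() == kNaluTypeFuA);
	return static_cast<uint8_t>((indicator.byte & 0xE0) | fu.unitType());
}
```

For example, an FU indicator of 0x7C with an FU header of 0x85 marks the first fragment of a type 5 (IDR slice) unit whose original header byte is 0x65.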

Sean-Der · 2024-02-18T02:57:52Z

@paullouisageneau Can I get another review please?

I also fixed the 'Outdated' comments. I can't respond to them inline on GitHub though :/

Sean-Der · 2024-02-18T02:58:48Z

After this commit lands I am going to add FrameInfo and an Opus depacketizer! After that I can go back and add WHEP support to OBS.

Thanks for merging+reviewing so much @paullouisageneau

paullouisageneau

@Sean-Der It looks good, thank you for your work! Would you mind replacing the firstByte and secondByte manipulation with casts to NalUnitHeader and NalUnitFragmentHeader so it is not implemented twice?

Sean-Der · 2024-02-20T02:53:51Z

@paullouisageneau Done! Can I get another review?

I added one small method to NalUnitHeader and was able to drop the firstByte/secondByte.

With some more refactoring/exposing things we could drop even more.

Sean-Der · 2024-02-21T14:30:21Z

@paullouisageneau Ok I think I got it this time :) Mind taking a look? If this is good I am going to start on Opus + FrameInfo.

paullouisageneau

It looks good, thank you! If you have the opportunity to add Opus and FrameInfo metadata, it would be my pleasure to review it.

@Sean-Der

This commit adds an H265 depacketizer which takes incoming H265 RTP packets and emits H265 access units. It is closely based on the `H264RtpDepacketizer` added by @Sean-Der in paullouisageneau#1082.

I originally started with a version of this commit that was closer to the `H264RtpDepacketizer` and which emitted individual H265 NALUs in `H265RtpDepacketizer::buildFrames()`. This resulted in calling my `Track::onFrame()` callback for each NALU, which did not work well with the decoder that I'm using which wants to see the VPS/SPS/PPS NALUs as a unit before initializing the decoder (https://intel.github.io/libvpl/v2.10/API_ref/VPL_func_vid_decode.html#mfxvideodecode-decodeheader).

So for the `H265RtpDepacketizer` I've tried to make it emit access units rather than NALUs. An "access unit" is (RFC 7798):

> A set of NAL units that are associated with each other according to a specified classification rule, that are consecutive in decoding order, *and that contain exactly one coded picture.*

"Exactly one coded picture" seems to correspond with what a caller might expect an "onFrame" callback to do. Maybe the `H264RtpDepacketizer` should be revised to similarly emit H264 access units rather than NALUs, too. At least, I could not find a way to receive individual NALUs from the depacketizer and run the VPL decoder without needing to do my own buffering/copying of the NALUs.

With this commit I can now do the following:

* Generate encoded bitstream output from the Intel VPL encoder.
* Pass the output of the encoder one frame at a time to libdatachannel's `Track::send()` on a track with an `H265RtpPacketizer` media handler.
* Transport the video track over a WebRTC connection to a libdatachannel peer.
* Depacketize it with the `H265RtpDepacketizer` media handler in this commit.
* Pass the depacketized output via my `Track::onFrame()` callback to the Intel VPL decoder in "complete frame" mode (https://intel.github.io/libvpl/v2.10/API_ref/VPL_enums.html#_CPPv428MFX_BITSTREAM_COMPLETE_FRAME). Each "onFrame" callback corresponds to a single call to the decoder API to decode a frame.
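To make the VPS/SPS/PPS point concrete, here is a small application-side sketch that checks whether one access unit carries all three parameter sets. It assumes the access unit has already been split into individual NAL units without start codes. The `Nalu` alias and both function names are hypothetical, while the type values (32, 33, 34) and the two-byte header layout are those of the H.265 NAL unit header.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Nalu = std::vector<uint8_t>; // one NAL unit, without start code

constexpr uint8_t kH265NaluTypeVps = 32;
constexpr uint8_t kH265NaluTypeSps = 33;
constexpr uint8_t kH265NaluTypePps = 34;

// The H.265 NAL unit header is two bytes:
// F (1 bit) | Type (6 bits) | LayerId (6 bits) | TID (3 bits)
inline uint8_t h265NaluType(const Nalu &nalu) {
	assert(nalu.size() >= 2);
	return static_cast<uint8_t>((nalu[0] >> 1) & 0x3F);
}

// Returns true if the access unit contains VPS, SPS and PPS NAL units,
// i.e. everything a decoder needs to read the stream headers from it.
inline bool hasParameterSets(const std::vector<Nalu> &accessUnit) {
	bool vps = false, sps = false, pps = false;
	for (const auto &nalu : accessUnit) {
		switch (h265NaluType(nalu)) {
		case kH265NaluTypeVps: vps = true; break;
		case kH265NaluTypeSps: sps = true; break;
		case kH265NaluTypePps: pps = true; break;
		default: break; // slice data, SEI, etc.
		}
	}
	return vps && sps && pps;
}
```

When the callback delivers one NALU at a time, a check like this can only succeed after the application buffers the units itself, which is the extra copying described above.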

paullouisageneau reviewed Jan 7, 2024

Sean-Der force-pushed the h264-rtp-depacketizer branch 4 times, most recently from ac302f1 to b42ff73 on February 16, 2024 17:51

paullouisageneau reviewed Feb 17, 2024

Sean-Der force-pushed the h264-rtp-depacketizer branch from b42ff73 to 838e21f on February 18, 2024 02:58

paullouisageneau reviewed Feb 19, 2024

Sean-Der force-pushed the h264-rtp-depacketizer branch from 838e21f to ee1b355 on February 20, 2024 02:53

paullouisageneau reviewed Feb 21, 2024

Add H264RtpDepacketizer

70a1fc3

Inverse of H264RtpPacketizer. Takes incoming H264 packets and emits H264 NALUs.
Co-authored-by: Paul-Louis Ageneau <[email protected]>

Sean-Der force-pushed the h264-rtp-depacketizer branch from ee1b355 to 70a1fc3 on February 21, 2024 14:29

paullouisageneau approved these changes Feb 22, 2024

paullouisageneau merged commit b7f1f03 into paullouisageneau:master Feb 22, 2024
12 checks passed

Sean-Der deleted the h264-rtp-depacketizer branch February 22, 2024 12:44

edmonds mentioned this pull request Mar 18, 2024

Add H265RtpDepacketizer #1134 (Merged)
