Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NDI source stalls since timestamp changes #79

Open
normen opened this issue Oct 18, 2021 · 46 comments
Open

NDI source stalls since timestamp changes #79

normen opened this issue Oct 18, 2021 · 46 comments

Comments

@normen
Copy link

normen commented Oct 18, 2021

Hi,

Issue description:
When using a ndisrc as described in the README. the pipeline starts up but remains on time 0:00:00:000 forever, stalling the pipeline. In some random, rare cases, with uncompressed (non-HX) sources the pipeline starts to run.

After fighting with these stalling issues in the latest version for a while (both on MacOS and Windows) I found that the problem seems to exist since c98b626 - versions up to 7c3e50c seem to be working as expected.

Thanks for this plugin,
Normen

@sdroege
Copy link
Contributor

sdroege commented Oct 18, 2021

Can you get a debug log with GST_DEBUG=ndi*:9 and attach it here?

@sdroege
Copy link
Contributor

sdroege commented Oct 18, 2021

Also please try different timestamp-mode settings to check if you can see a pattern there about what works and what doesn't.

@normen
Copy link
Author

normen commented Oct 18, 2021

I already tried all timestamp modes to no avail, I'll try and create a log file.

@normen
Copy link
Author

normen commented Oct 18, 2021

Heres the log, it's quite long, I let it run for a short while:
ndilog.txt

@normen
Copy link
Author

normen commented Oct 18, 2021

I was wrong about the timestamp mode, it does "work" when I set the autosink to sync=false and the timestamp-mode to 2 - but I'd like to have control over the latency/timing, thats why I am using this in the first place, so sync off is no proper solution..

@sdroege
Copy link
Contributor

sdroege commented Oct 18, 2021

Thanks, I'll take a look through the log later. sync=false should work with any timestamp-mode because then timestamps are just going to be ignored :)

@normen
Copy link
Author

normen commented Oct 18, 2021

Well, it doesn't - it only works with mode 2 - maybe thats a hint? Thanks for looking into it!

@sdroege
Copy link
Contributor

sdroege commented Oct 18, 2021

How does your pipeline look like? Maybe a full GST_DEBUG=9 would be useful here...

@normen
Copy link
Author

normen commented Oct 18, 2021

Its a gst-launch pipeline exactly as in the README.

@normen
Copy link
Author

normen commented Oct 18, 2021

Heres a full log, 16MB..
ndilog.txt

@sdroege
Copy link
Contributor

sdroege commented Oct 19, 2021

In your full log you shouldn't have seen any video at all and also heard no audio.

You need to increase the queue sizes here or otherwise this won't work. The first audio arrives at 0.65s, the first video at 1.94s, but the queues only hold up to 1s in your case.

Try configuring both queues with max-size-bytes=0 max-size-buffers=0 max-size-time=5000000000 (5s).

@normen
Copy link
Author

normen commented Oct 19, 2021

Correct, I didn't see or hear anything - these logs are from the exact gst-launch pipeline in the README using the official "NDI HX Camera" app on iPhone/Pad as a source.

Increasing the max-size-time on both queues to 10s didn't really solve the issue - the video starts playing but after a few seconds I get the "dropped buffer" messages again, the audio is very choppy if its playing at all. I also get a lot of Too small/big slope errors..

@normen
Copy link
Author

normen commented Oct 19, 2021

Funnily, in a different context (obs-gstreamer plugin) the queue seems to work when using timestamp-mode=2 - no additional buffers, no different settings, even when the appsink is synced...

However I noticed that the pipeline doesn't seem to react when setting an overall latency on the pipeline - this usually works for USB and RTSP sources to delay/sync them. The NDI pipeline always comes in with the same latency, no matter what I set for the latency, like this:

    gst_pipeline_set_latency (GST_PIPELINE (data->pipe), latency * GST_MSECOND);
    gint cur_latency = gst_pipeline_get_latency(GST_PIPELINE(data->pipe)) / GST_MSECOND;
    blog(LOG_INFO, "Set latency for pipeline to %d", cur_latency);

@sdroege
Copy link
Contributor

sdroege commented Oct 19, 2021

Increasing the max-size-time on both queues to 10s didn't really solve the issue - the video starts playing but after a few seconds I get the "dropped buffer" messages again, the audio is very choppy if its playing at all. I also get a lot of Too small/big slope errors..

Right, that's the situation I would be interested in having logs for :)

If you set timestamp mode to receive-time then this shouldn't happen btw.

However I noticed that the pipeline doesn't seem to react when setting an overall latency on the pipeline - this usually works for USB and RTSP sources to delay/sync them. The NDI pipeline always comes in with the same latency, no matter what I set for the latency, like this:

That's a bad idea unless you know that the pipeline's latency is going to be lower than what you're configuring on it. This API should only be used if you want to do inter-device / inter-pipeline synchronization.

@normen
Copy link
Author

normen commented Oct 19, 2021

That's a bad idea unless you know that the pipeline's latency is going to be lower than what you're configuring on it. This API should only be used if you want to do inter-device / inter-pipeline synchronization.

Sure, as you explained in your great presentation :) However I should be getting an error/warning from the pipeline if thats not the case, right? I am using this to sync up multiple live camera sources (RTSP, USB, NDI) in OBS and it works a treat - except for NDI.

@normen
Copy link
Author

normen commented Oct 19, 2021

Heres a log with max-size-time at 10s, only GST_DEBUG=ndi*:9 as the log gets too big with a full debug:
ndilog.txt

@normen
Copy link
Author

normen commented Oct 19, 2021

Heres a full log with GST_DEBUG=6, hope thats enough.
ndilog.txt

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Sure, as you explained in your great presentation :) However I should be getting an error/warning from the pipeline if thats not the case, right? I am using this to sync up multiple live camera sources (RTSP, USB, NDI) in OBS and it works a treat - except for NDI.

If the appsink in the end is sync=true, check in the debug logs when the buffers are arriving, what latency is actually configured on the appsink and what it's doing with regards to clock waiting. My guess is that either they're all arriving late, or there's a latency configuration problem on the application-side.


For the problem at hand here, can you confirm that setting timestamp-mode=receive-time also does not solve it for you?

@normen
Copy link
Author

normen commented Oct 20, 2021

If the appsink in the end is sync=true, check in the debug logs when the buffers are arriving, what latency is actually configured on the appsink and what it's doing with regards to clock waiting. My guess is that either they're all arriving late, or there's a latency configuration problem on the application-side.

I am syncing the cameras visually by filming the same metronome - when I change the pipeline latency value in a USB or RTSP pipeline the visual result is as expected - the camera comes in later or sooner. Except for the NDI pipeline which even with 10 seconds set as latency doesn't change its "real world" latency in the pipe.

When setting a pipeline latency that is too low on the RTSP, USB or NDI sources I immediately get a clock error with the attached debug description explaining that the items in the pipeline have a higher latency than what is configured. The NDI pipeline always seems to report an internal latency of 20ms, so when I set a latency of 10ms I get this:

12:03:14.552: GStreamer error: clock problem.
12:03:14.552: Additional debug info:
12:03:14.552: ../gst/gstpipeline.c(712): gst_pipeline_do_latency (): /GstPipeline:pipeline7:
12:03:14.552: Configured latency is lower than detected minimum latency: configured 0:00:00.010000000 < min 0:00:00.020000000

For the problem at hand here, can you confirm that setting timestamp-mode=receive-time also does not solve it for you?

Yes, the problem persists with timestamp-mode=receive-time - I think you should be able to reproduce it with the "NDI HX Camera" for iOS as I am experiencing it on both MacOS and Windows with that app.

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Except for the NDI pipeline which even with 10 seconds set as latency doesn't change its "real world" latency in the pipe.

As said, check the appsink/clock logs :) This is a different problem to what this issue is about though.


I think you should be able to reproduce it with the "NDI HX Camera" for iOS as I am experiencing it on both MacOS and Windows with that app.

If you send me an iOS device, sure ;) I was testing with the NDI HX Camera app on Android with Linux as a receiver and that seems to be working correctly here.

I'll check if there's something useful in your logs later.

@normen
Copy link
Author

normen commented Oct 20, 2021

Except for the NDI pipeline which even with 10 seconds set as latency doesn't change its "real world" latency in the pipe.

As said, check the appsink/clock logs :) This is a different problem to what this issue is about though.

Heres a level 6 log of OBS running with only one Gst-NDI source, I can't find any hints that setting the latency to 250ms fails, still its not applied in reality:

obslog.txt

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Like I said, everything's arriving too late in the sink

0:00:10.435205900  8364 000001E7B475C2C0 DEBUG               basesink gstbasesink.c:2785:gst_base_sink_do_sync:<video_appsink> clock returned 1, jitter  0:00:07.335453700

Means that this frame arrived 7.3s too late. As such, they will just be rendered immediately.

That might be a hint to the underlying problem here. Wrong timestamps in some sense.

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Well yes, all frames have a PTS of 0.

@normen
Copy link
Author

normen commented Oct 20, 2021

But wheres that time difference coming from? Its by far not coming in 7 seconds late in reality. I can see the video, it's not 7 seconds late.

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Wrong timestamping.

@normen
Copy link
Author

normen commented Oct 20, 2021

Okay, so based on the fact that I only get video from the "NDI HX Camera" when using timestamp-mode=2 (timecode) and that the video timecode then is 7 seconds off somehow the only question that remains is why doesn't it work when using receive-time, that should rule out timecode problems on the sender side, right?

Its certainly not the first time that I see clock/sync issues with NDI, even before I added gstreamer to the mix, so I won't rule out that this is an issue with the NDI runtime (using the latest V5 SDK on both windows and mac). But I only ever saw these extreme time differences when using gstreamer with it. With the gst-launch pipeline with 10s buffer I actually saw/heard the 7 seconds time difference between audio and video in some cases - in some not.

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

If you check the logs, all the NDIlib_audio_frame_v3_t and NDIlib_video_frame_v2_t have a timecode of 0. So if you use any of the timecode-based modes it's not going to work very well. We should probably detect this case though.

If you select a timestamp or receive time based mode then it should work though. Can you give me a log for timestamp-vs-receive-time and another one for receive-time?

@normen
Copy link
Author

normen commented Oct 20, 2021

Heres two logs, one with mode 0 and one with mode 4, neither yield an image in OBS, the previous was with mode 2, the only one that works somewhat.
obslog-mode-4.txt
obslog-mode-0.txt

@sdroege
Copy link
Contributor

sdroege commented Oct 20, 2021

Thanks, I meant mode 1 (timestamp-vs...) and not 0 (timecode-vs...) though, can you also attach that? 0 is also useful however.

Mode 2 can't really work if the NDI sender is passing 0 for all timecodes. And generally (unless you really know what you're doing), you want either mode 0, 1 or 4 :)

@normen
Copy link
Author

normen commented Oct 20, 2021

Right, sorry. Well if mode 2 is the only one that yields an image expect me to use that ;)
obslog-mode-1.txt (Didn't show an image)

@sdroege
Copy link
Contributor

sdroege commented Oct 21, 2021

Right, sorry. Well if mode 2 is the only one that yields an image expect me to use that ;)
obslog-mode-1.txt (Didn't show an image)

You didn't configure the queues to be large enough again. First audio arrived at 6.96s, first video at 8.31s and the queue in front of the audio sink ran full after 7s.

Can you get a new set of those 3 logs with big enough queues?

@normen
Copy link
Author

normen commented Oct 21, 2021

Okay, I don't get how thats necessary for internal receive times but here goes..
obslog-mode-4-10sbuf.txt
obslog-mode-1-10sbuf.txt
obslog-mode-0-10sbuf.txt

@sdroege
Copy link
Contributor

sdroege commented Oct 21, 2021

It's necessary so that both sinks actually receive data, consume it continuously and so I can see how the timestamps are in relation to the pipeline clock.

The mode 4 log (the first one), you still don't get any video/audio out of there? All video/audio frames arrive in time at the sinks there and it looks all ok. At least starting around 8.3s into the log, before that frames are too late and then slowly catching up. Which seems to be related to audio being available much before video. Needs some more analysis.

Mode 1 log should behave approximately the same.

Mode 0 log is a bit strange and needs some more analysis but it looks like data is arriving in approximately 1s batches, so a higher latency would be necessary to configure here.

I'll take another look in the next days, this needs some time.

@normen
Copy link
Author

normen commented Oct 21, 2021

Sorry, forgot to mention that yes, I do get an image with all of these settings, audio is a bit choppy at times and it is quite late (not 7 seconds though),

As I am really only interested in the video from the cameras (the microphones come in via a mixing desk) - are you saying I can avoid the issues by only using the video part of the demuxer pipeline (as there would be no syncing up)?

@sdroege
Copy link
Contributor

sdroege commented May 31, 2022

I finally got to look at these logs and trying to look into fixing the whole situation. I'll take your logs as one of my testcases for a more robust algorithm for the timestamps.

One thing that seems problematic in your log is

  1. timecodes are all set to 0 (OK, we should be able to handle that)
  2. timestamps between audio and video are apart 1.3s in the beginning, and audio timestamps are the same for all frames. This suggests a sender bug and we should try to detect this situation too somehow and fall back to using plain receive times here.

Figure_1

@normen
Copy link
Author

normen commented May 31, 2022

Okay, cool. The Sender is the official NDI HX Camera app for iOS in case it matters.

@sdroege
Copy link
Contributor

sdroege commented May 31, 2022

Yeah I heard a few times already that it is apparently broken. The Android version of the app is behaving correctly.

@normen
Copy link
Author

normen commented May 31, 2022

I guess it might also be a matter with the NDI core libraries on the "darwin" platforms.. I also get strange behavior when I try to display a NDI screen grab from an iMac.

sdroege added a commit to sdroege/gst-plugin-ndi that referenced this issue Jun 7, 2022
 - NDI HX Camera Android in the past used 1ns instead of 100ns as unit
   for timecodes/timestamps.
 - NDI HX Camera iOS uses 0 for all timecodes and the same non-zero
   value for all audio timestamps

Detect such situations and try to compensate for them. Also add a new
"auto" timestamping mode that prefers to use timecodes and otherwise
falls back to timestamps or receive times.

Fixes teltek#79
@sdroege
Copy link
Contributor

sdroege commented Jun 7, 2022

This should be fixed by #102 with the default settings now

@normen
Copy link
Author

normen commented Jun 10, 2022

Cool, it'll be some time until I can check but I'll report back!

@normen
Copy link
Author

normen commented Oct 5, 2022

So, coming back to testing this and on sources that have no audio the latest version works a treat. For sources with audio not so much, I couldn't get it to work with actual NDI hardware, or the NDI HX iPhone app or with the new NDICam App or even with a screen capture session. It always gets stuck at 00:00:00 and seems to drop pretty much all frames.

Heres a log with the new NDICam App:
log.txt

Edit: Also, I get errors when I don't grab the audio from the demuxer. I didn't know this kind of behavior from gstreamer, usually when I don't connect an audio sink the audio frames simply get dropped (e.g. RTSP).

@normen
Copy link
Author

normen commented Oct 5, 2022

D'oh, I just realized your PR wasn't merged yet, checked out your branch now and it works great, thanks a lot!

@sdroege
Copy link
Contributor

sdroege commented Oct 6, 2022

Thanks for testing :)

gstreamer-github pushed a commit to sdroege/gst-plugin-rs that referenced this issue Oct 12, 2022
 - NDI HX Camera Android in the past used 1ns instead of 100ns as unit
   for timecodes/timestamps.
 - NDI HX Camera iOS uses 0 for all timecodes and the same non-zero
   value for all audio timestamps

Detect such situations and try to compensate for them. Also add a new
"auto" timestamping mode that prefers to use timecodes and otherwise
falls back to timestamps or receive times.

Fixes teltek/gst-plugin-ndi#79
@sdroege
Copy link
Contributor

sdroege commented Oct 12, 2022

I've integrated the fix here now: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/tree/main/net/ndi

@normen
Copy link
Author

normen commented Oct 22, 2022

Hmmm.. It looks like I have to come back to this. With the latest version from the gst-plugins-rs I still get the described issues. Every time audio gets in the mix the stream stalls. When I have a source without audio everything works fine. One issue is that I can't drop the audio packets in a pipeline that has audio, I have to pick up the audio pad or I get a pipeline error. Usually in gstreamer I don't have to do that?

Also I get mixed up with all the versions.. It seems all versions that have the ndisrcdemux element show this behavior. The version from your private repo that has separate ndivideosrc and ndiaudiosrc elements works best for me. I am currently using that to stream my NDI cameras.

@sdroege
Copy link
Contributor

sdroege commented Oct 22, 2022

Can you report that over there with the exact pipeline you're using and a GST_DEBUG=6,ndi*:9 debug log? That should help finding the cause. Like this there's no actionable information :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants