Evaluation of synchronization edge cases #53
-
After collecting some data with @Poofjunior, we also came across an interesting delay in clock synchronization events between G and S.
What we see is that the Harp Behavior LED turns on 170 us after the Generator (we repeated the tests on two different boards). I should point out that, as I stressed in a previous meeting, this has never been a problem before because the clock generators have always been mute in functionality past the synch event itself. However, should one add a digital input to the clock board and record events from the clock and the behavior board simultaneously, this delay would become apparent.
This is very interesting because, prior to these tests, we acquired some data to verify whether the behavior board and a pico core board were synchronized (see below) and came across an unexplained gap of 215 us. The gap may not be because the pico core is out of synch with the generator, but instead because the ATmega core devices are not perfectly synchronized to the ATxmega generator. Taking these differences into consideration, the final difference is much closer to the expected 32 us jitter (i.e. 215 - 170 = 45 us).
I will add the same oscilloscope test, run between the clock generator and a pico device, later this week. As a final thought, I think we should really revise the synchronization algorithm, make sure we are all in agreement on it, and make sure it also supports new core implementations.
-
Another option is to calibrate each of the subordinate devices by |
-
Additional details/experiments
Heartbeats in ATxmega are not respecting the synchronization standard
Brown = clock synchronizer input
If everything were working as expected, one would assume that the heartbeat callbacks should be as close as possible to the 672 us spec of the protocol. While this is true in the case of the White signal, the behavior board heartbeat appears to lag 225 us (perfectly matching what we saw between the Harp Behavior and cuttlefish board above).
Heartbeat LED may not be a good way to benchmark synchrony
We thought that maybe the LED is not a perfect way to validate this approach. Indeed, if you look at when the LED of a clock generator blinks relative to ITS OWN generated clock synch signal (red against brown traces), we still see an unexpected 722 - 672 = 50 us lag.
It seems the ATxmega core is 170 us off
Taking the previous two plots together with the one shown here (#53 (comment)), and removing the contribution of the 50 us LED delay (this assumes that the delay to turn on the LED at the top of the full second is shared across all ATxmega devices), we end up with roughly 170 us. Given the current spec, I believe this should be considered a bug and patched as soon as possible to ensure interoperability across all devices.
Moreover, we should write down somewhere what we expect to be validated from a new core. I believe this is only possible if the new core has a way to guarantee that a heartbeat callback materializes with tight timing, but I am open to other suggestions. This way we could benchmark it against the generated clock signal and not against an already existing device.
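For anyone who wants to redo the subtraction above with their own scope readings, here is a minimal sketch. It only uses the three numbers quoted in this comment; the variable names are mine, and the step that subtracts the LED delay rests on the same assumption stated above (that all ATxmega devices share it).

```python
# Values in microseconds, as quoted in the comment above.
SPEC_US = 672                    # figure from the protocol spec
GENERATOR_LED_US = 722           # generator LED vs its own synch signal
BEHAVIOR_HEARTBEAT_LAG_US = 225  # behavior board heartbeat lag

# Delay between the generator's own synch signal and its LED turning on.
led_delay_us = GENERATOR_LED_US - SPEC_US

# ASSUMPTION: every ATxmega device has the same LED delay, so it can be
# subtracted from the lag seen on the behavior board.
core_offset_us = BEHAVIOR_HEARTBEAT_LAG_US - led_delay_us

print(f"LED delay:   {led_delay_us} us")
print(f"core offset: {core_offset_us} us")
```

The result lands within a few microseconds of the "roughly 170 us" figure, which is about as close as cursor readings on a scope are likely to get.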
-
The ATxmega delay of 208 +/- 16 us was fixed on this commit. The current deviation is now a delay (A) relative to the timestamp generator of 22 +/- 16 us or, in other words, between 6 and 38 us, which seems perfectly acceptable. The oscilloscope photo shows the delay recorded over 2 hours.
We can do better, which is to have a delay (B) between -6 and 29 us. I used instruments to bring the timestamp listener's crystal to 0 and 80 degrees, so that the clock drifts in both directions and the drift is significant. The test shows that the implementation of delay (B) is robust but very, very close to the edges, so I preferred the implementation of delay (A).
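To make the comparison between the two windows concrete, here is a small sketch that turns "22 +/- 16 us" into a range and reports the worst-case deviation of each option. `DelayWindow` and its fields are hypothetical names for illustration; only the numbers come from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayWindow:
    """Range of delays (in us) relative to the timestamp generator."""
    low_us: float
    high_us: float

    @classmethod
    def from_tolerance(cls, center_us, tol_us):
        # "22 +/- 16" becomes the range [6, 38].
        return cls(center_us - tol_us, center_us + tol_us)

    @property
    def worst_case_us(self):
        # Largest absolute deviation from the generator.
        return max(abs(self.low_us), abs(self.high_us))

    @property
    def crosses_zero(self):
        # True if the listener can end up on either side of the generator.
        return self.low_us < 0 < self.high_us


delay_a = DelayWindow.from_tolerance(22, 16)
delay_b = DelayWindow(-6, 29)

for name, window in (("A", delay_a), ("B", delay_b)):
    print(name, window, window.worst_case_us, window.crosses_zero)
```

Delay (B) has the smaller worst case, while delay (A) stays strictly on one side of the generator. Whether that sign change is what "close to the edges" refers to is my guess, not something stated above.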
-
This discussion is related to #48
Definitions
Let:
S -> Subordinate device (the one that receives the synchronization pulse)
G -> Clock generator device
SE -> Synchronization pulse/event, the sequence of bits sent from G to S each second to allow S to synchronize
Heartbeat -> the periodic event sent by the board roughly every second
Discussion points
Immediate vs scheduled (deferred) synchronization
If I am not mistaken, the two cores use slightly different approaches.
The RP2040 tries to schedule a predicted heartbeat 1 second into the future using the most recent synch event. Since the next second is always assumed to be "correct", any correction is applied when scheduling the following second. This prevents double hits (e.g. if S is running faster than G, S might emit 2 heartbeat events back to back, since it will cross the full second on its own clock and then be brought back by SE).
The AtMega seems to force the synchronization. In theory this allows for tighter synchronization, at the cost of potential double hits.
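To illustrate the difference, here is a toy simulation of the two strategies. It is a sketch of the idea as I described it above, not the actual code of either core. Assumptions: SE arrives `lead_us` before G's full second, S runs at a constant `drift_ppm` relative to G, and S emits a heartbeat whenever its local clock crosses a full second. All function and parameter names are made up, and the drift is exaggerated on purpose so the effect shows up.

```python
from collections import Counter

SECOND_US = 1_000_000


def immediate_heartbeats(drift_ppm, lead_us, n_seconds):
    """Heartbeat labels when every SE overwrites the local clock of S.

    Toy model: SE k sets the local clock to k * SECOND_US - lead_us.
    """
    if lead_us <= 0:
        raise ValueError("this toy model needs SE to arrive before the full second")
    rate = 1 + drift_ppm * 1e-6
    labels = []
    for k in range(1, n_seconds + 1):
        start = k * SECOND_US - lead_us   # local clock right after SE k
        end = start + SECOND_US * rate    # local clock right before SE k + 1
        # Full seconds crossed by the local clock in (start, end].
        first = int(start // SECOND_US) + 1
        last = int(end // SECOND_US)
        labels.extend(range(first, last + 1))
    return labels


def scheduled_heartbeats(drift_ppm, lead_us, n_seconds):
    """(label, error_us) pairs when each SE only schedules the NEXT heartbeat.

    On SE k, S schedules heartbeat k + 1 to fire lead_us + SECOND_US local
    microseconds later and leaves the pending heartbeat k untouched.
    """
    rate = 1 + drift_ppm * 1e-6
    out = []
    for k in range(1, n_seconds + 1):
        se_true_us = k * SECOND_US - lead_us
        fire_true_us = se_true_us + (lead_us + SECOND_US) / rate
        out.append((k + 1, fire_true_us - (k + 1) * SECOND_US))
    return out


def double_hits(labels):
    """Seconds for which more than one heartbeat was emitted."""
    return sorted(label for label, n in Counter(labels).items() if n > 1)


# S is fast by 150 us per second, SE arrives 100 us before the full second.
forced = immediate_heartbeats(drift_ppm=150, lead_us=100, n_seconds=5)
deferred = scheduled_heartbeats(drift_ppm=150, lead_us=100, n_seconds=5)
print("immediate, double hits:", double_hits(forced))
print("scheduled, double hits:", double_hits([label for label, _ in deferred]))
print("scheduled, errors (us):", [round(err, 2) for _, err in deferred])
```

In this model the immediate approach only double-hits once the drift per second exceeds the lead of SE. The scheduled approach never double-hits, but its heartbeat carries the full per-second drift as error. That is the trade-off I am asking about.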
Drift per second as part of the spec.
Should we enforce a maximum drift per second?
How should it be benchmarked?
How to handle edge cases of "going back in time"?
Do you think it should be allowed?
Should we have an arbitrary threshold between small out-of-synch events (e.g. < drift per second) and larger out-of-synch events (e.g. when forcing a new timestamp in G)?
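As a starting point for the threshold question, here is one way such a rule could be written down. Everything in it is hypothetical: the names, the sign convention (offset = G time minus S time) and the idea that small errors are absorbed in the next scheduled second while large ones cause a jump.

```python
from enum import Enum


class Correction(Enum):
    SLEW = "slew"  # small error: absorb it when scheduling the next second
    STEP = "step"  # large error: jump straight to the new timestamp


def plan_correction(offset_us, max_drift_us):
    """Return (correction, goes_back_in_time) for a measured offset.

    offset_us    -- G time minus S time at the synch event (hypothetical convention)
    max_drift_us -- largest offset still treated as ordinary drift
    """
    if abs(offset_us) <= max_drift_us:
        # ASSUMPTION: slewing never rewinds the clock of S.
        return Correction.SLEW, False
    # A step with a negative offset moves the clock of S backwards.
    return Correction.STEP, offset_us < 0


for offset in (30, -30, 5000, -5000):
    print(offset, plan_correction(offset, max_drift_us=100))
```

With a rule like this, "going back in time" can only happen on a step, so the spec would only need to say whether steps are allowed and how they are reported.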