Desync in network play #964
Comments
It may be worth setting up component hashing on Bones Schema so that we can produce hashes of the world snapshots for comparison. I can't remember how deeply I integrated it, but we already have a We'd need to do something like a Then we can implement a function on the
Great info, thanks. Yeah, I started looking at producing a hash/checksum we can use with ggrs to benefit from its desync detection (ggrs supports passing this in with the saved cell, and will check for desyncs between players at a specified interval). I hacked in a callback the user can specify to generate the hash (in lieu of a global solution) and hashed only player transforms. It found a desync on the 2nd frame (even in a 2-player match, apparently).
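For illustration, a minimal sketch of what such a checksum callback could compute, assuming a simplified `PlayerTransform` struct (hypothetical, not the real bones type) and a hand-rolled FNV-1a hash so the result does not depend on the standard library's hasher. Floats are hashed through their bit patterns, so any bitwise difference between clients shows up in the checksum.

```rust
/// Hypothetical, simplified player transform; not the real bones type.
#[derive(Clone, Copy)]
struct PlayerTransform {
    x: f32,
    y: f32,
    rotation: f32,
}

// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Fold `bytes` into a running FNV-1a hash.
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Checksum over all player transforms, in slice order.
/// The iteration order must be identical on every client.
fn transforms_checksum(players: &[PlayerTransform]) -> u64 {
    let mut hash = FNV_OFFSET;
    for p in players {
        for v in [p.x, p.y, p.rotation] {
            // Hash the exact bit pattern with a fixed byte order.
            hash = fnv1a(hash, &v.to_bits().to_le_bytes());
        }
    }
    hash
}
```

The value returned here is what would be handed to the rollback layer as the per-frame checksum. How it gets wired into ggrs is left out on purpose.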
Added a debug log in networking to dump log inputs to be activated with and found that one player gets:
Other gets:
DenseMoveDirection is different on match start for one player, possibly a bug in input management. Tracking that down now.

On the note of advanced desync debugging: ggrs has a SyncTestSession that will roll back and resimulate every frame, and compare the original hash with the re-simulated hash. After a desync is detected in normal play, we could also take the state of the last confirmed frame that was not desync'd, dump the inputs for the later frames that desync'd, and run a sync test session so the local client can test for non-determinism (whether re-simulating gives different results with the same input).

We could even do something fancy in which we manage a data structure that breaks down hashes by subdividing the data, like hashing only resources or only components, drilling down by type, and comparing those to determine exactly which data is non-deterministic. Haven't really thought that through and we don't need a real solution yet. Full world hashing would be a great first step. But at least this issue seems to have some clues without getting too far into the weeds.
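A rough sketch of the hash breakdown idea, assuming each client can produce a map from a label (say, a component or resource type name) to a hash of that slice of the world. `HashBreakdown`, `hash_section`, and `diverging_sections` are hypothetical names for illustration, not part of bones or ggrs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical per-section hashes of a world snapshot, keyed by a label
/// such as a component or resource type name.
type HashBreakdown = BTreeMap<String, u64>;

/// Hash one section of the world. Floats are not `Hash`, so callers would
/// pass bit patterns (e.g. `f32::to_bits`) instead of raw floats.
fn hash_section<T: Hash>(data: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Labels whose hashes differ between two breakdowns, or that exist on
/// only one side. An empty result means the snapshots agree.
fn diverging_sections(a: &HashBreakdown, b: &HashBreakdown) -> Vec<String> {
    let mut out = Vec::new();
    for (label, hash) in a {
        if b.get(label) != Some(hash) {
            out.push(label.clone());
        }
    }
    for label in b.keys() {
        if !a.contains_key(label) {
            out.push(label.clone());
        }
    }
    out
}
```

Comparing the breakdown from the original simulation against the one from a resimulation (or from another client) narrows a desync down to the sections that disagree. One caveat: the algorithm behind `DefaultHasher` is not guaranteed to stay the same across Rust releases, so a fixed hash function would be the safer choice when comparing between different builds.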
Ah, so it seems that on the first frame, when there is no previous input, GGRS provides a blank / predicted input of the dense input type that is zeroed. DenseMoveDir encodes a vec value between -1 and 1 as a quantized integer, so a zeroed value does not decode to zero movement. So the predicted input at the first frame is not "no input".
Perhaps it would make sense to consider whether supporting the Default trait in ggrs would be a good solution. Otherwise we could quantize the magnitude of each move dir value between 0 and 1 with 5 bits, and then include a sign bit, as the range is symmetric. Should be the same precision I think, but zero'd would be 0.
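The sign-plus-magnitude idea could look roughly like this for a single axis. This is only a sketch under the assumptions stated in the comment above (5 magnitude bits, 1 sign bit). The function and constant names are made up for illustration, and the encoding in the actual PR may differ.

```rust
/// Hypothetical 6-bit sign-magnitude encoding for one move-direction axis.
/// Bits 0..=4 hold the magnitude quantized to 0..=31, bit 5 is the sign.
const MAG_BITS: u32 = 5;
const MAG_MAX: u8 = (1 << MAG_BITS) - 1; // 31
const SIGN_BIT: u8 = 1 << MAG_BITS; // 0b10_0000

/// Quantize a value in [-1, 1] (out-of-range values are clamped).
fn encode_axis(value: f32) -> u8 {
    let clamped = value.clamp(-1.0, 1.0);
    let mag = (clamped.abs() * MAG_MAX as f32).round() as u8;
    // Never emit a "negative zero", so no input is always all-zero bits.
    if clamped < 0.0 && mag != 0 {
        SIGN_BIT | mag
    } else {
        mag
    }
}

/// Recover the approximate axis value from its 6-bit encoding.
fn decode_axis(bits: u8) -> f32 {
    let mag = (bits & MAG_MAX) as f32 / MAG_MAX as f32;
    if bits & SIGN_BIT != 0 {
        -mag
    } else {
        mag
    }
}
```

With this layout an all-zero (zeroed) value decodes to exactly `0.0`, so a blank predicted input means "no movement", while `-1.0` and `1.0` still round-trip exactly.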
Opened a PR that changes the representation so zeroed is no input: fishfolk/bones#380. It's a possible solution and seems to fix this bug. Though I can still easily desync clients in 4-player matches, so there are other issues at play.
I opened a draft PR in GGRS that seems to fix an isolated test case I set up. In smoke testing with 3+ clients in Jumpy I am no longer reproducing desyncs between players. Definitely some more work to do before this would be ok to merge upstream, and it needs some review for sure. Will test Jumpy a bit more in depth over the next couple of days and see if I can repro any other desync, but barring more issues I'll likely put us on a fork so we can validate for a bit and see if any other issues pop up.
#967 should fix the remaining desync issues (using the fix I linked above on a fork of ggrs). I'm no longer reproducing this, so I'm gonna close it. Regarding desync detection: I opened an issue tracking that on bones in the meantime, until I get to wrapping that up.
Super awesome, this is a really big deal! 🎉 I'd never actually gotten the game not to de-sync after some playing, so this is exciting to finally not be able to re-produce de-syncs.
Description
If I boot up 3-4 clients (4 is more frequent), I can pretty easily desync one or more of them with simple gameplay.
The client performance is terrible with so many running locally, and my ping is high, so this is kind of a worst-case condition, but something seems to be wrong here. Going to try using some of the ggrs tools to track this down. This is on main without any of my local WIP stuff.
To Reproduce
Expected Behavior
No response
Additional Context
EDIT: I reproduced this at ca53ba9 (before the package + physics updates), so we can probably rule out some of these recent changes. May be a more fundamental issue.
Log Messages
No response