Batch skinned meshes on platforms where storage buffers are available. #16599
This commit makes skinned meshes batchable on platforms other than WebGL 2. On such platforms, it replaces the two uniform buffers used for joint matrices with a pair of storage buffers containing all matrices for all skinned meshes concatenated together. The indices into the buffer are stored in the mesh uniform and mesh input uniform. The GPU mesh preprocessing step copies the indices in if that step is enabled. On the `many_foxes` demo, I observed a frame time decrease from 15.470ms to 11.935ms. This is the result of reducing the `submit_graph_commands` time from an average of 5.45ms to 0.489ms, an 11x speedup in that portion of rendering.
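The packing scheme described above can be illustrated with a minimal, CPU-only sketch. This is not the code from this PR: `Mat4`, `JointBuffer`, `push_mesh`, and `joint` are hypothetical names, and the sketch assumes only what the description states, namely that all joint matrices live in one concatenated buffer and each mesh records the index of its first joint.

```rust
/// A 4x4 matrix as nested arrays, standing in for a real math type (assumption).
type Mat4 = [[f32; 4]; 4];

/// Hypothetical CPU-side mirror of one shared storage buffer of joint matrices.
#[derive(Default)]
struct JointBuffer {
    matrices: Vec<Mat4>,
}

impl JointBuffer {
    /// Appends one mesh's joint matrices to the shared buffer and returns the
    /// index of its first joint. That index is what a per-mesh uniform would store.
    fn push_mesh(&mut self, joints: &[Mat4]) -> u32 {
        let base = self.matrices.len() as u32;
        self.matrices.extend_from_slice(joints);
        base
    }

    /// Looks up a joint the way a shader could: base index plus local joint index.
    fn joint(&self, base: u32, local: u32) -> &Mat4 {
        &self.matrices[(base + local) as usize]
    }
}
```

Because every mesh reads from the same buffer and differs only in its base index, consecutive skinned meshes no longer need a different joint buffer bound between draws, which is the property that makes them batchable.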
I'm not sure this is working. I ran `many_foxes` through RenderDoc, and I still see hundreds of `vkCmdDraw` calls, each preceded by two `vkCmdBindDescriptorSets` calls: one for StandardMaterial and one for `skinned_mesh_bind_group`. I would have expected only one `vkCmdBindDescriptorSets` for `skinned_mesh_bind_group`. Since the meshes are the same, StandardMaterial shouldn't need rebinding either, so all the draws could collapse into a single draw with `instance_count=N`. Am I missing why this doesn't seem to reduce commands?
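The collapse this comment expects can be sketched as follows. This is a hedged illustration of draw batching in general, not Bevy's batching code: `DrawKey`, `BatchedDraw`, and `batch_draws` are hypothetical names, and the sketch assumes two draws can merge whenever they are adjacent and need the same mesh and material bound.

```rust
/// Hypothetical key describing the state a draw needs bound.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DrawKey {
    mesh: u32,
    material: u32,
}

/// One instanced draw covering a run of identical keys.
#[derive(Debug, PartialEq, Eq)]
struct BatchedDraw {
    key: DrawKey,
    first_instance: u32,
    instance_count: u32,
}

/// Merges runs of adjacent draws with equal keys into single instanced draws.
fn batch_draws(keys: &[DrawKey]) -> Vec<BatchedDraw> {
    let mut batches: Vec<BatchedDraw> = Vec::new();
    for (i, &key) in keys.iter().enumerate() {
        // Extend the current batch if this draw needs the same bindings.
        if let Some(last) = batches.last_mut() {
            if last.key == key {
                last.instance_count += 1;
                continue;
            }
        }
        // Otherwise start a new batch at this instance index.
        batches.push(BatchedDraw {
            key,
            first_instance: i as u32,
            instance_count: 1,
        });
    }
    batches
}
```

Under these assumptions, N identical foxes submitted back to back produce one batch with `instance_count` equal to N, while any draw with a different key in between splits the run.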
Windows, Intel 155h.
It works on my M2 Mac mini too. Hmm.
Works fine on my desktop.
I'm not super familiar with the skinning system, but the code looks consistent, and I tested `many_foxes` on both Linux+webgl2 and both work.
Running into a panic with this on the
backtrace
Looks like motion vectors are also broken with this (tested by adding the
backtrace
I fixed the motion vectors issue, but was unable to reproduce the
I can't reproduce the
Motion vectors no longer crash, but they also don't appear to be working for skinned meshes now. Motion vector prepass texture once everything is rendered (left: main, right: this PR):
This seems fine to merge, but I'm doing a manual example run here to check for regressions. Ping me once that's done, or tomorrow if I forget.
OK, the skinned mesh motion vector thing should be fixed now.
Can confirm that motion vectors for skinned meshes appear to be fixed!
I believe the CI failure is #15981.
Example run is green.
bevyengine#16599) This commit makes skinned meshes batchable on platforms other than WebGL 2. On supported platforms, it replaces the two uniform buffers used for joint matrices with a pair of storage buffers containing all matrices for all skinned meshes packed together. The indices into the buffer are stored in the mesh uniform and mesh input uniform. The GPU mesh preprocessing step copies the indices in if that step is enabled. On the `many_foxes` demo, I observed a frame time decrease from 15.470ms to 11.935ms. This is the result of reducing the `submit_graph_commands` time from an average of 5.45ms to 0.489ms, an 11x speedup in that portion of rendering. ![Screenshot 2024-12-01 192838](https://github.com/user-attachments/assets/7d2db997-8939-466e-8b9e-050d4a6a78ee) This is what the profile looks like for `many_foxes` after these changes. ![Screenshot 2024-12-01 193026](https://github.com/user-attachments/assets/68983fc3-01b8-41fd-835e-3d93cb65d0fa) --------- Co-authored-by: François Mockers <[email protected]>