Testing branches for Reaper #1414
Comments
Hmm, that's interesting, as I got slightly lower fps on metro and habitat with https://github.com/VReaperV/Daemon/tree/material-stages-tex. I'm guessing it's down to a difference in how the drivers handle the respective buffers.
It looks like the perceived slowdown I was getting was actually due to bugs on master; now I get the same or higher fps on the branch above.
Testing: https://github.com/VReaperV/Daemon/tree/test-no-multidraw
The difference between multidraw and no multidraw may just be noise.
Yes, I redid “multidraw no-material” with plat23, and now it is:
Hmm, interesting, I got slightly better performance with the test-no-multidraw branch, but maybe that was just a fluke. It's interesting that habitat now shows better performance with the material system than without it, compared to the first test here. That's probably due to the fixes I made earlier.
Oh, btw @illwieckz , what result do you get on master/test-no-multidraw without |
I've been thinking that quantising the stage data and offloading textures to a buffer with a fixed layout might improve this further. For context, right now each drawSurf gets its own copy of the surface data in the buffer, which means a lot of data is duplicated. Additionally, that data currently spans 128 and 192 bytes for the generic and lightMapping shaders respectively, which are two of the most abundant ones. The former therefore fits only 0.5 or 1 times in a typical cache line (assuming 64- or 128-byte lines), while the latter will overfetch. It also increases the bandwidth used for updating this data, and makes merging surfaces into one draw command impossible (unless switching to Vulkan, or using an Nvidia extension which didn't even work in that regard on my end).

The reason each surface copies its data is that some of the data (lightmap, deluxemap and light factor) is per-surface; I tried just storing data per-stage first. The https://github.com/VReaperV/Daemon/tree/material-stages-tex branch offloads some of the data to a different buffer to work around this issue.

However, after looking at the shaders and uniforms, I believe I can fit all of the generic and lightMapping shader stage data into 8 and 20 bytes per stage respectively, while storing the textures in a different, fixed-layout buffer. The stage data can then even be put into a uniform buffer, which might work a little faster. 16 bits could be used to index it, with the remaining 16 bits used to store the light factor and an index to the textures and lightmap/deluxemap. The light factor can even be just 1 bit, since it's always either 1.0 or the map light factor, which can be set as a global uniform. Only the texture index would then prevent merging different surfaces (other than having a different material, that is), since textures can only be indexed with a dynamically uniform value. From my testing it seems this should allow merging lots of different surfaces.
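To make the proposed packing concrete, here is a rough C++ sketch of one 32-bit per-surface word split as described above: 16 bits of stage index, 1 bit selecting the light factor, and the remaining 15 bits as a texture index. The field order, the names (`SurfaceRef`, `PackSurfaceRef`, `UnpackSurfaceRef`) and the two cache-line helpers are all assumptions made for illustration, not code from any of the branches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of the 32-bit per-surface word (an assumption):
//   bits  0..15  stage index into the per-stage buffer
//   bit   16     light factor selector (0 = 1.0, 1 = map light factor)
//   bits 17..31  index into the fixed-layout texture buffer
constexpr uint32_t STAGE_BITS = 16;
constexpr uint32_t LIGHT_FACTOR_SHIFT = STAGE_BITS;
constexpr uint32_t TEXTURE_SHIFT = STAGE_BITS + 1;
constexpr uint32_t STAGE_MASK = (1u << STAGE_BITS) - 1u;
constexpr uint32_t TEXTURE_MASK = (1u << (32 - TEXTURE_SHIFT)) - 1u; // 15 bits

struct SurfaceRef {
    uint32_t stageIndex;
    bool useMapLightFactor;
    uint32_t textureIndex;
};

// Pack the three fields into one 32-bit word.
inline uint32_t PackSurfaceRef(const SurfaceRef& ref) {
    // Values that do not fit their field would silently alias another entry.
    assert(ref.stageIndex <= STAGE_MASK);
    assert(ref.textureIndex <= TEXTURE_MASK);
    return (ref.stageIndex & STAGE_MASK)
         | (static_cast<uint32_t>(ref.useMapLightFactor) << LIGHT_FACTOR_SHIFT)
         | ((ref.textureIndex & TEXTURE_MASK) << TEXTURE_SHIFT);
}

// Inverse of PackSurfaceRef.
inline SurfaceRef UnpackSurfaceRef(uint32_t packed) {
    return SurfaceRef{ packed & STAGE_MASK,
                       ((packed >> LIGHT_FACTOR_SHIFT) & 1u) != 0u,
                       (packed >> TEXTURE_SHIFT) & TEXTURE_MASK };
}

// Cache lines touched by one record that starts on a line boundary.
constexpr uint32_t CacheLinesTouched(uint32_t recordBytes, uint32_t lineBytes) {
    return (recordBytes + lineBytes - 1u) / lineBytes;
}

// Whole records that fit in a single cache line.
constexpr uint32_t RecordsPerCacheLine(uint32_t recordBytes, uint32_t lineBytes) {
    return lineBytes / recordBytes;
}
```

With 64-byte lines, the current 128-byte record touches two lines, while 8-byte stage records would fit eight to a line and 20-byte records three, ignoring any padding that the actual buffer layout rules might add.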
The https://github.com/VReaperV/Daemon/tree/material-clusters branch was an attempt at merging surfaces by using texture arrays and binding textures per material (with a texture layer and scale used in the shader for each relevant one), which worked alright on my end (sans some bugs at surface edges), but didn't really seem to give a performance benefit. On Mesa/AMD it was slower than the current material system, as tested by @illwieckz; however, using per-stage material data might help with this. It does seem, though, that I overcomplicated that branch (it even copies vertices, not just the indices, and does so for each view), and the surface merging can probably be better achieved in another pass: the cull and surface processing shaders are already very fast, especially if the subgroup extension is supported.
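As a rough CPU-side illustration of what such a merging pass could do (the real one would presumably run on the GPU), the sketch below groups surfaces by material and texture index and concatenates only their indices into one index buffer, emitting a single draw command per group. All of the types and names here (`Surface`, `DrawCommand`, `MergedBatch`, `MergeSurfaces`) are hypothetical and do not correspond to the engine's actual structures.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical surface: a merge key plus the indices it draws.
struct Surface {
    uint32_t materialID;
    uint32_t textureIndex;
    std::vector<uint32_t> indices;
};

// Hypothetical indexed draw command covering a range of the merged index buffer.
struct DrawCommand {
    uint32_t materialID;
    uint32_t textureIndex;
    uint32_t firstIndex;
    uint32_t indexCount;
};

struct MergedBatch {
    std::vector<uint32_t> indices;     // concatenated indices of all surfaces
    std::vector<DrawCommand> commands; // one command per (material, texture) group
};

// Merge surfaces that share material and texture index into one command each.
// Only indices are copied; vertices stay where they are.
inline MergedBatch MergeSurfaces(std::vector<Surface> surfaces) {
    // Sort by merge key so that mergeable surfaces become neighbours.
    std::stable_sort(surfaces.begin(), surfaces.end(),
        [](const Surface& a, const Surface& b) {
            return std::tie(a.materialID, a.textureIndex)
                 < std::tie(b.materialID, b.textureIndex);
        });

    MergedBatch batch;
    for (const Surface& surf : surfaces) {
        const bool sameKey = !batch.commands.empty()
            && batch.commands.back().materialID == surf.materialID
            && batch.commands.back().textureIndex == surf.textureIndex;
        if (!sameKey) {
            // Start a new command at the current end of the index buffer.
            batch.commands.push_back(DrawCommand{
                surf.materialID, surf.textureIndex,
                static_cast<uint32_t>(batch.indices.size()), 0u });
        }
        batch.indices.insert(batch.indices.end(),
                             surf.indices.begin(), surf.indices.end());
        batch.commands.back().indexCount +=
            static_cast<uint32_t>(surf.indices.size());
    }
    return batch;
}
```

Keying on the texture index reflects the constraint mentioned earlier in the thread: it has to stay uniform within a draw, so surfaces that differ in it cannot share a command.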
Testing: https://github.com/VReaperV/Daemon/tree/material-stages-tex
System:
GPU: AMD Radeon PRO W7600
CPU: AMD Ryzen Threadripper PRO 3955WX
resolution: 3840×2160
preset: ultra
Framerate on default spectator scenes: