CalcMaxLateAllocLimit() minor sgpr calculation bug #84

shanminchao · 2022-02-01T09:38:24Z

In gfx9GraphicsPipeline.cpp, under the CalcMaxLateAllocLimit() function, you have the following:

const uint32 vsNumSgpr = (numSgprs * 8);
const uint32 vsNumVgpr = (numVgprs * 4);

My understanding is that it should be this instead for gfx9:

const uint32 vsNumSgpr = (numSgprs * 16);

Actually, I'm wondering whether a +1 is also needed before multiplying. The code does check later on whether vsNumSgpr is > 0, but then uses it like so:

const uint32 maxSgprVsWaves = (chipProps.gfx9.numPhysicalSgprs / vsNumSgpr) * simdPerSh;

... which, unless numPhysicalSgprs is 0-based as well, seems to suggest the +1 should be there ...
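To make the effect of the missing +1 concrete, here is a minimal sketch of the two interpretations of the register field. The helper names are hypothetical and not part of PAL, and the figure of 800 physical sgprs per SIMD used in the note below is an assumption for illustration only.

```cpp
#include <cstdint>

// Hypothetical helper (not PAL code): number of waves whose sgprs fit in one
// SIMD's register file, given the per-wave sgpr count.
constexpr uint32_t WavesPerSimd(uint32_t numPhysicalSgprs, uint32_t sgprsPerWave)
{
    // Guard against division by zero, mirroring the "> 0" check mentioned above.
    return (sgprsPerWave > 0) ? (numPhysicalSgprs / sgprsPerWave) : 0;
}

// Interpretation A: the field is a plain count of 8-register blocks.
constexpr uint32_t SgprsAsCount(uint32_t field)
{
    return field * 8;
}

// Interpretation B: the field is 0-based, so a value of 0 means one block of 8.
constexpr uint32_t SgprsZeroBased(uint32_t field)
{
    return (field + 1) * 8;
}
```

With an assumed 800 physical sgprs and a field value of 4, interpretation A gives 32 sgprs per wave and 25 waves, while interpretation B gives 40 sgprs per wave and 20 waves. If the field really is 0-based, the plain multiplication would overestimate the wave count.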

shanminchao · 2022-02-01T15:35:38Z

After some more investigation, I've come to realize that the issue is a bit more subtle than I expected.

The sgpr allocation granularity for my gfx9 card (Vega64) is 16. The sgpr count in the register encoding is 0-based and expressed in multiples of 8. So after multiplying by 8, the result needs to be rounded up to the nearest multiple of 16. Furthermore, the +1 still seems to be missing.
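The calculation described above could be sketched roughly as follows. This is only an illustration of my reading, not the actual PAL code: the constants come from the observations in this comment (encoding unit of 8, granularity of 16 on Vega64), and the helper names are made up.

```cpp
#include <cstdint>

// Assumed values taken from the comment above, not from PAL headers.
constexpr uint32_t kSgprEncodingUnit     = 8;  // register field counts blocks of 8
constexpr uint32_t kSgprAllocGranularity = 16; // as observed on Vega64

// Round value up to the next multiple of alignment (alignment must be nonzero).
constexpr uint32_t RoundUpToMultiple(uint32_t value, uint32_t alignment)
{
    return ((value + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: sgprs actually reserved per wave for a given 0-based
// encoded register field.
constexpr uint32_t AllocatedSgprs(uint32_t encodedField)
{
    // Decode the 0-based field first, then apply the allocation granularity.
    return RoundUpToMultiple((encodedField + 1) * kSgprEncodingUnit,
                             kSgprAllocGranularity);
}
```

Under these assumptions, an encoded value of 0 or 1 reserves 16 sgprs, a value of 2 reserves 32, and a value of 4 reserves 48 rather than the 32 that a plain multiplication by 8 would give.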
