Multistream rendering and instancing
In this lesson we learn how to use multistream rendering to implement GPU instancing.
First create a new project using the instructions from the earlier lessons: Using DeviceResources and Adding the DirectX Tool Kit which we will use for this lesson.
For these tutorial lessons, we've been providing a single stream of vertex data to the input assembler. Generally the most efficient rendering uses a single vertex buffer with a stride of 16, 32, or 64 bytes, but there are times when arranging the render data in such a layout is expensive. The Direct3D Input Assembler can therefore pull vertex information from up to 32 vertex buffers, each bound to its own input slot. This provides a lot of freedom in managing your vertex buffers.
For example, if we return to a case from Simple rendering, here is the 'stock' vertex input layout for VertexPositionNormalTexture:
const D3D12_INPUT_ELEMENT_DESC c_InputElements[] =
{
    { "SV_Position", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "NORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
};
This describes a single vertex stream with three elements. We could arrange this into three VBs as follows:
// Position in VB#0, NORMAL in VB#1, TEXCOORD in VB#2
const D3D12_INPUT_ELEMENT_DESC c_InputElements[] =
{
    { "SV_Position", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "NORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 1, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 2, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
};
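To make the three-buffer arrangement concrete, here is a minimal CPU-side sketch of splitting interleaved vertex data into one tightly packed array per slot. It does not use Direct3D at all: `Float2`, `Float3`, `InterleavedVertex`, `SplitStreams`, and `SplitVertexStreams` are hypothetical names invented for this illustration, with `InterleavedVertex` standing in for VertexPositionNormalTexture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

// Hypothetical stand-in for VertexPositionNormalTexture (one interleaved vertex).
struct InterleavedVertex
{
    Float3 position;
    Float3 normal;
    Float2 texcoord;
};
static_assert(sizeof(InterleavedVertex) == 32, "single-stream stride should be 32 bytes");

// One tightly packed array per input slot.
struct SplitStreams
{
    std::vector<Float3> positions; // slot 0, 12-byte stride
    std::vector<Float3> normals;   // slot 1, 12-byte stride
    std::vector<Float2> texcoords; // slot 2,  8-byte stride
};

// Copies each element of every vertex into its own stream, preserving order
// so that entry i of each stream still describes vertex i.
inline SplitStreams SplitVertexStreams(const std::vector<InterleavedVertex>& vertices)
{
    SplitStreams out;
    out.positions.reserve(vertices.size());
    out.normals.reserve(vertices.size());
    out.texcoords.reserve(vertices.size());

    for (const auto& v : vertices)
    {
        out.positions.push_back(v.position);
        out.normals.push_back(v.normal);
        out.texcoords.push_back(v.texcoord);
    }

    assert(out.positions.size() == vertices.size());
    return out;
}
```

Each resulting array would then be uploaded into its own vertex buffer resource. Keeping the entries in the same order is what lets a single vertex index address all three streams.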
To render, we'd need to create a Pipeline State Object (PSO) for this input layout, and then bind the vertex buffers to each slot:
D3D12_VERTEX_BUFFER_VIEW vbViews[3] = {};
vbViews[0].BufferLocation = ...;
vbViews[0].StrideInBytes = sizeof(float) * 3;
vbViews[0].SizeInBytes = ...;
vbViews[1].BufferLocation = ...;
vbViews[1].StrideInBytes = sizeof(float) * 3;
vbViews[1].SizeInBytes = ...;
vbViews[2].BufferLocation = ...;
vbViews[2].StrideInBytes = sizeof(float) * 2;
vbViews[2].SizeInBytes = ...;
// Bind all three views starting at slot 0 (the array decays to a pointer to its first view).
commandList->IASetVertexBuffers(0, 3, vbViews);
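The stride assigned to each view follows from the formats of the elements placed in that slot. As a rough sketch, assuming every element is tightly packed (true here because all the formats are multiples of 4 bytes), the per-slot strides and view sizes could be derived as shown below. `ElementSketch`, `PackedStrides`, and `ViewSizeInBytes` are hypothetical helpers for illustration, not Direct3D or DirectX Tool Kit APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified mirror of the two input-layout fields that matter here.
struct ElementSketch
{
    unsigned inputSlot;   // which vertex buffer the element is read from
    unsigned sizeInBytes; // 12 for R32G32B32_FLOAT, 8 for R32G32_FLOAT
};

// Returns the tightly packed stride of each slot, i.e. the sum of the sizes
// of the elements assigned to that slot.
inline std::vector<unsigned> PackedStrides(const std::vector<ElementSketch>& elements)
{
    std::vector<unsigned> strides;
    for (const auto& e : elements)
    {
        if (e.inputSlot >= strides.size())
            strides.resize(e.inputSlot + 1, 0u);
        strides[e.inputSlot] += e.sizeInBytes;
    }
    return strides;
}

// Size of a view that holds vertexCount tightly packed entries.
inline std::size_t ViewSizeInBytes(unsigned stride, std::size_t vertexCount)
{
    return static_cast<std::size_t>(stride) * vertexCount;
}
```

For the three-buffer layout this yields strides of 12, 12, and 8 bytes, matching the `sizeof(float) * 3` and `sizeof(float) * 2` values above. The single-buffer layout yields one stride of 32 bytes.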
Note that if we are using an indexed draw (DrawIndexedInstanced in Direct3D 12), then the same index value is used to retrieve the 'ith' element from each vertex buffer (i.e. there is only one index per vertex, and every VB must contain more entries than the highest index value).
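Because one index addresses every stream, a quick validation pass on the CPU can catch a stream that is too short before anything is submitted. The following is a minimal sketch assuming 16-bit indices. `StreamsCoverIndices` is a hypothetical helper name, and the counts passed in are the number of entries in each vertex buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when every stream holds an entry for the largest index used.
inline bool StreamsCoverIndices(const std::vector<std::uint16_t>& indices,
                                const std::vector<std::size_t>& streamElementCounts)
{
    if (indices.empty())
        return true; // nothing is fetched, so nothing can be out of range

    const std::size_t maxIndex = *std::max_element(indices.begin(), indices.end());

    // Index i reads entry i, so each stream needs at least maxIndex + 1 entries.
    return std::all_of(streamElementCounts.begin(), streamElementCounts.end(),
        [maxIndex](std::size_t count) { return count > maxIndex; });
}
```

For example, a quad drawn with indices 0 through 3 needs at least four entries in each of the position, normal, and texture coordinate buffers.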
In addition to pulling vertex data from multiple streams, the input assembler can also 'loop' over some streams to implement a feature called "instancing". Here the same vertex data is drawn multiple times, with some of the input data changing only "once per instance" as it loops over the rest. This allows you to efficiently render a large number of copies of the same object in many locations, such as grass or boulders.
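The looping behavior can be emulated on the CPU to see which data each vertex shader invocation would receive. This is only an illustrative sketch of a non-indexed instanced draw with an instance step rate of 1. `ShaderInput` and `EmulateInstancedFetch` are hypothetical names, and a simple position offset stands in for the per-instance payload.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// What the vertex shader would receive for one invocation.
struct ShaderInput
{
    Float3 position; // read from the per-vertex stream
    Float3 offset;   // read from the per-instance stream (hypothetical payload)
};

// Emulates the fetch pattern of an instanced draw: the per-vertex stream
// restarts for every instance, while the per-instance stream advances once
// per instance.
inline std::vector<ShaderInput> EmulateInstancedFetch(
    const std::vector<Float3>& perVertex,
    const std::vector<Float3>& perInstance)
{
    std::vector<ShaderInput> fetched;
    fetched.reserve(perVertex.size() * perInstance.size());

    for (std::size_t instance = 0; instance < perInstance.size(); ++instance)
    {
        for (std::size_t vertex = 0; vertex < perVertex.size(); ++vertex)
        {
            fetched.push_back(ShaderInput{ perVertex[vertex], perInstance[instance] });
        }
    }

    assert(fetched.size() == perVertex.size() * perInstance.size());
    return fetched;
}
```

With 3 vertices and 2 instances, this produces 6 shader inputs: the first three pair each vertex with instance 0, and the next three pair the same vertices with instance 1.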
The NormalMapEffect supports GPU instancing using an XMFLOAT3X4 matrix, supplied once per instance through a second vertex stream, which can include translations, rotations, scales, etc. For example, if we were using VertexPositionNormalTexture model data with instancing, we'd create an input layout as follows:
// VertexPositionNormalTexture in VB#0, XMFLOAT3X4 in VB#1
const D3D12_INPUT_ELEMENT_DESC c_InputElements[] =
{
    { "SV_Position", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "NORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "InstMatrix", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_INSTANCE_DATA, 1 },
    { "InstMatrix", 1, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_INSTANCE_DATA, 1 },
    { "InstMatrix", 2, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_INSTANCE_DATA, 1 },
};
Here the first vertex buffer has enough data for one instance, and the second vertex buffer has as many entries as instances.
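As a rough illustration of what the second buffer might contain, the sketch below fills an array with one 48-byte transform per instance. `InstanceTransform` is a hypothetical stand-in for XMFLOAT3X4, and `MakeTranslation` and `MakeRowOfInstances` are invented helpers. The sketch assumes the translation sits in the fourth component of each row. Verify that against the DirectXMath store function you actually use before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for XMFLOAT3X4: three rows of four floats (48 bytes),
// one row per R32G32B32A32_FLOAT "InstMatrix" element in slot 1.
struct InstanceTransform
{
    float m[3][4];
};
static_assert(sizeof(InstanceTransform) == 48, "instance stride should be 48 bytes");

// Assumption for this sketch: identity rotation and scale, with the
// translation stored in the fourth component of each row.
inline InstanceTransform MakeTranslation(float x, float y, float z)
{
    InstanceTransform t = {};
    t.m[0][0] = 1.0f; t.m[0][3] = x;
    t.m[1][1] = 1.0f; t.m[1][3] = y;
    t.m[2][2] = 1.0f; t.m[2][3] = z;
    return t;
}

// Builds one entry per instance, spacing the copies out along the X axis.
inline std::vector<InstanceTransform> MakeRowOfInstances(std::size_t count, float spacing)
{
    std::vector<InstanceTransform> instances;
    instances.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        instances.push_back(MakeTranslation(spacing * static_cast<float>(i), 0.0f, 0.0f));
    }
    assert(instances.size() == count);
    return instances;
}
```

The vertex buffer view for slot 1 would then use a stride of `sizeof(InstanceTransform)` and a size of that stride multiplied by the number of instances.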
UNDER CONSTRUCTION
- GPU instancing is also supported by DebugEffect and PBREffect
Next lessons: Using HDR rendering
All content and source code for this package are subject to the terms of the MIT License.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.