ANGLE's Vulkan back-end implementation lives in this folder.
Vulkan is an explicit graphics API. It has a lot in common with other explicit APIs such as Microsoft's D3D12 and Apple's Metal. Compared to APIs like OpenGL or D3D11, explicit APIs can offer a number of significant benefits:
- Lower API call CPU overhead.
- A smaller API surface with more direct hardware control.
- Better support for multi-core programming.
- Vulkan in particular has open-source tooling and tests.
[TOC]
The RendererVk is a singleton. RendererVk owns shared global resources like the VkDevice, VkQueue, the Vulkan format tables and internal Vulkan shaders. The back-end creates a new ContextVk instance to manage each allocated OpenGL Context. ContextVk processes state changes and handles action commands like `glDrawArrays` and `glDrawElements`.
Typical OpenGL programs issue a few small state change commands between draw call commands. We want this typical use case to be as fast as possible, which leads to unique performance challenges.
Vulkan is quite different from OpenGL because it requires a separate compiled VkPipeline for each state vector. Compiling VkPipelines is multiple orders of magnitude slower than enabling or disabling an OpenGL render state. To speed this up we use three levels of caching when transitioning states in the Vulkan back-end.
The first level is the driver's VkPipelineCache. The driver cache reduces pipeline recompilation time significantly. But even cached pipeline recompilations are orders of magnitude slower than OpenGL state changes.
The second level cache is an ANGLE-owned hash map from OpenGL state vectors to compiled pipelines. See GraphicsPipelineCache in vk_cache_utils.h. ANGLE's GraphicsPipelineDesc class is a tightly packed 256-byte description of the current OpenGL rendering state. We also use xxHash for the fastest possible hash computation. The hash map speeds up state changes considerably. But it is still significantly slower than OpenGL implementations.
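The second-level cache can be pictured with a minimal sketch like the one below. It is an illustration only, not ANGLE's code: `PipelineDesc`, `PipelineDescHash`, `PipelineCache` and `PipelineHandle` are hypothetical stand-ins for GraphicsPipelineDesc, the hash functor, GraphicsPipelineCache and VkPipeline, and `std::hash` over the raw bytes is swapped in where ANGLE uses xxHash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for GraphicsPipelineDesc: a fixed-size blob without padding.
struct PipelineDesc
{
    uint8_t bytes[256] = {};

    // Full 256-byte comparison; this is what guards against hash collisions.
    bool operator==(const PipelineDesc &other) const
    {
        return std::memcmp(bytes, other.bytes, sizeof(bytes)) == 0;
    }
};

// Hashes the whole state vector. std::hash is an assumption standing in for xxHash.
struct PipelineDescHash
{
    std::size_t operator()(const PipelineDesc &desc) const
    {
        return std::hash<std::string_view>()(
            std::string_view(reinterpret_cast<const char *>(desc.bytes), sizeof(desc.bytes)));
    }
};

using PipelineHandle = uint32_t;  // Stand-in for a compiled VkPipeline.

class PipelineCache
{
  public:
    // Returns the cached pipeline for this state vector, "compiling" one on a miss.
    PipelineHandle getPipeline(const PipelineDesc &desc)
    {
        auto iter = mPipelines.find(desc);
        if (iter != mPipelines.end())
        {
            return iter->second;
        }
        PipelineHandle handle = ++mCompileCount;  // Placeholder for the slow compile.
        mPipelines.emplace(desc, handle);
        return handle;
    }

    uint32_t compileCount() const { return mCompileCount; }

  private:
    std::unordered_map<PipelineDesc, PipelineHandle, PipelineDescHash> mPipelines;
    uint32_t mCompileCount = 0;
};
```

Every lookup in this sketch pays for one hash over all 256 bytes plus one 256-byte `memcmp` on a hit, which is the per-draw cost the hash map alone cannot avoid.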
To get best performance we use a transition table from each OpenGL state vector to neighbouring state vectors. The transition table points from GraphicsPipelineCache entries directly to neighbouring VkPipeline objects. When the application changes state the state change bits are recorded into a compact bit mask that covers the GraphicsPipelineDesc state vector. Then on the next draw call we scan the transition bit mask and compare the GraphicsPipelineDesc of the current state vector and the state vector of the cached transition. With the hash map we compute a hash over the entire state vector and then do a 256-byte `memcmp` to guard against hash collisions. With the transition table we will only compare as many bytes as were changed in the transition bit mask. By skipping the expensive hashing and `memcmp` we can get performance as good as or better than native OpenGL drivers.
Note that the current design of the transition table stores transitions in an unsorted list. If an application transitions from one state to many others, the linear scan of that list slows down the transition. This could be improved in the future using a faster lookup. For instance we could keep a sorted transition table or use a small hash map for transitions.
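A rough sketch of the transition lookup follows. All names here (`Desc`, `CacheEntry`, `Transition`, `setStateByte`, `findTransition`) are hypothetical, and the granularity of one dirty bit per 4-byte chunk is an assumption chosen so that a 256-byte vector fits a 64-bit mask; ANGLE's actual layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kDescSize  = 256;
constexpr std::size_t kChunkSize = 4;  // Assumed: one dirty bit per 4-byte chunk.

struct Desc
{
    uint8_t bytes[kDescSize] = {};
};

using TransitionBits = uint64_t;  // 256 / 4 = 64 chunks, one bit each.

struct CacheEntry;

struct Transition
{
    TransitionBits bits;       // Chunks that differ between source and target.
    const CacheEntry *target;  // Neighbouring cache entry reached by this change.
};

struct CacheEntry
{
    Desc desc;
    uint32_t pipeline = 0;                // Stand-in for VkPipeline.
    std::vector<Transition> transitions;  // Unsorted list, scanned linearly.
};

// Applies a state change and records which chunk it touched.
void setStateByte(Desc *desc, TransitionBits *dirty, std::size_t offset, uint8_t value)
{
    desc->bytes[offset] = value;
    *dirty |= TransitionBits{1} << (offset / kChunkSize);
}

// Compares only the chunks flagged in the bit mask.
bool chunksMatch(const Desc &a, const Desc &b, TransitionBits bits)
{
    for (std::size_t chunk = 0; bits != 0; ++chunk, bits >>= 1)
    {
        if ((bits & 1) != 0 && std::memcmp(a.bytes + chunk * kChunkSize,
                                           b.bytes + chunk * kChunkSize, kChunkSize) != 0)
        {
            return false;
        }
    }
    return true;
}

// Returns the neighbouring entry for the desired state, or nullptr on a miss.
const CacheEntry *findTransition(const CacheEntry &current,
                                 TransitionBits bits,
                                 const Desc &desired)
{
    for (const Transition &transition : current.transitions)
    {
        if (transition.bits == bits && chunksMatch(transition.target->desc, desired, bits))
        {
            return transition.target;
        }
    }
    return nullptr;
}
```

On a miss the caller would fall back to the hash map and then append a new `Transition` to the current entry. Because both the target and the desired state differ from the current entry only in the flagged chunks, comparing those chunks is enough to identify a match.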
ANGLE converts application shaders into Vulkan `VkShaderModules` through a series of steps:
- ANGLE Internal Translation: The initial calls to `glCompileShader` are passed to the ANGLE shader translator. The translator compiles application shaders into Vulkan-compatible GLSL. Vulkan-compatible GLSL matches the GL_KHR_vulkan_glsl extension spec with some additional workarounds and emulation. We emulate OpenGL's different depth range, viewport y flipping, default uniforms, and OpenGL line segment rasterization. For more info see TranslatorVulkan.cpp. After initial compilation the shaders are not complete. They are templated with markers that are filled in later at link time.
- Link-Time Translation: During a call to `glLinkProgram` the Vulkan back-end knows the locations and properties needed to connect the shader stage interfaces. We get the completed shader source using ANGLE's GlslangWrapper helper class. We still cannot generate `VkShaderModules` since some ANGLE features like OpenGL line rasterization emulation depend on draw-time information.
- Draw-time SPIR-V Generation: Once the application records a draw call we use Khronos' glslang to convert the Vulkan-compatible GLSL into SPIR-V with the correct draw-time defines. The SPIR-V is then compiled into `VkShaderModules`. For details please see GlslangWrapper.cpp. The `VkShaderModules` are then used by `VkPipelines`. Note that we currently don't use SPIRV-Tools to perform any SPIR-V optimization. This could be something to improve on in the future.
See the below diagram for a high-level view of the shader translation flow:
OpenGL and Vulkan both render line segments as a series of pixels between two points. They differ in which pixels cover the line.
For single sample rendering Vulkan uses an algorithm based on quad coverage. A small shape is extruded around the line segment. Samples covered by the shape then represent the line segment. See the Vulkan spec for more details.
OpenGL's algorithm is based on Bresenham's line algorithm. Bresenham's algorithm selects pixels on the line between the two segment points. Note that Bresenham's algorithm does not support multisampling. When compared visually you can see the Vulkan line segment rasterization algorithm always selects a superset of the line segment pixels rasterized in OpenGL. See this example:
The OpenGL spec defines a "diamond-exit" rule to select fragments on a line. Please refer to the 2.0 spec section 3.4.1 "Basic Line Segment Rasterization" for more details. To implement this rule we inject a small computation at the start of the pixel shader to test if a pixel falls within the diamond. If the pixel fails the diamond test we discard the fragment. Note that we only perform this test when drawing lines. See the section on Shader Compilation for more info. See the below diagram for an illustration of the diamond rule:
We can implement the OpenGL test by checking the intersection of the line and the medial axes of the pixel `p`. If the length of the line segment between intersections `p` and the point center is greater than a half-pixel for all possible `p` then the pixel is not on the segment. To solve for `p` we use the pixel center `a` given by `gl_FragCoord` and the projection of `a` onto the line segment `b` given by the interpolated `gl_Position`. Since `gl_Position` is not available in the fragment shader we must add an internal position varying when drawing lines.
The full code derivation is omitted for brevity. It reduces to the following shader snippet:
vec2 b = ((position * 0.5) + 0.5) * gl_Viewport.zw + gl_Viewport.xy;
vec2 ba = abs(b - gl_FragCoord.xy);
vec2 ba2 = 2.0 * (ba * ba);
vec2 bp = ba2 + ba2.yx - ba;
if (bp.x > epsilon && bp.y > epsilon)
    discard;
Note that we must also pass the viewport size as an internal uniform. We use a small epsilon value to correct for cases when the line segment is perfectly parallel or perpendicular to the window. For code please see TranslatorVulkan.cpp under `AddLineSegmentRasterizationEmulation`.
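To experiment with the test outside a shader, the snippet above can be ported to the CPU roughly as follows. This is a sketch under assumptions: `Vec2`, `Viewport`, `ndcToWindow` and `failsDiamondTest` are hypothetical names, and `epsilon` is left as a parameter because the value ANGLE uses is not stated here.

```cpp
#include <cassert>
#include <cmath>

struct Vec2
{
    float x;
    float y;
};

struct Viewport
{
    float x;
    float y;
    float width;
    float height;
};

// Maps normalized device coordinates in [-1, 1] to window coordinates,
// mirroring the first line of the shader snippet.
Vec2 ndcToWindow(Vec2 ndc, Viewport viewport)
{
    return {(ndc.x * 0.5f + 0.5f) * viewport.width + viewport.x,
            (ndc.y * 0.5f + 0.5f) * viewport.height + viewport.y};
}

// Returns true when the fragment would be discarded.
// b is the window-space projection onto the line; fragCoord is the pixel center.
bool failsDiamondTest(Vec2 b, Vec2 fragCoord, float epsilon)
{
    Vec2 ba  = {std::fabs(b.x - fragCoord.x), std::fabs(b.y - fragCoord.y)};
    Vec2 ba2 = {2.0f * ba.x * ba.x, 2.0f * ba.y * ba.y};
    // Component-wise equivalent of: ba2 + ba2.yx - ba
    Vec2 bp = {ba2.x + ba2.y - ba.x, ba2.y + ba2.x - ba.y};
    return bp.x > epsilon && bp.y > epsilon;
}
```

For example, a projection that lands exactly on the pixel center passes the test, while one that lands on the pixel corner, half a pixel away on both axes, is discarded.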
The required Vulkan format support tables do not implement the full set of formats needed for OpenGL conformance with extensions. ANGLE emulates missing formats using format overrides and format fallbacks.
An override implements a missing GL format with a required format in all cases. For example, the luminance texture format `L8_UNORM` does not exist in Vulkan. We override `L8_UNORM` with the required image format `R8_UNORM`.
A fallback is one or more non-required formats ANGLE checks for support at runtime. For example, `R8_UNORM` is not a required vertex buffer format. Some drivers do support `R8_UNORM` for vertex buffers. So at runtime we check for sampled image support and fall back to `R32_FLOAT` if `R8_UNORM` is not supported.
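The runtime fallback check can be sketched as a walk over an ordered candidate list. The names below (`FormatID`, `chooseFormat`) are hypothetical and do not mirror the generated format table; in a real back-end the `isSupported` predicate would wrap a query such as `vkGetPhysicalDeviceFormatProperties`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical format IDs for this example only.
enum class FormatID
{
    R8_UNORM,
    R32_FLOAT,
};

// Returns the first candidate the device reports as supported.
// The candidate list must be non-empty; its last entry is assumed to be a
// required format, so it is returned when nothing earlier is supported.
FormatID chooseFormat(const std::vector<FormatID> &candidates,
                      const std::function<bool(FormatID)> &isSupported)
{
    for (FormatID candidate : candidates)
    {
        if (isSupported(candidate))
        {
            return candidate;
        }
    }
    return candidates.back();
}
```

With the candidates `{R8_UNORM, R32_FLOAT}`, a driver that supports `R8_UNORM` gets the preferred format, and any other driver gets `R32_FLOAT`.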
Overrides and fallbacks are implemented in ANGLE's [Vulkan format table][vk_format_table_autogen.cpp]. The table is auto-generated by `gen_vk_format_table.py`. We store the mapping from `angle::Format::ID` to VkFormat in `vk_format_map.json`. The format map also lists the overrides and fallbacks.
To update the tables please modify the format map JSON and then run `scripts/run_code_generation.py`.
The `vk::Format` class describes the information ANGLE needs for a particular `VkFormat`. The 'ANGLE' format ID is a reference to the front-end format. The 'Image' and 'Buffer' formats are the native Vulkan formats that implement a particular front-end format for `VkImages` and `VkBuffers`. For the above example of `R8_UNORM` overriding `L8_UNORM`, `L8_UNORM` is the ANGLE format and `R8_UNORM` is the Image format.
For more information please see the source files.