Rendering multiple frames in flight, to avoid stalls, requires per-frame resources. That requirement quickly runs into Vulkan's upper limit on allocations (maxMemoryAllocationCount). As we all know, sharing is caring, so let's try sharing the resources. Really fundamental stuff, but it can't be ignored.
A uniform buffer is used as the example. The available sharing configurations are:
1. descriptorSets : 1
   uniformBuffer : 1
   uniformBufferMemory : 1
2. descriptorSets : CONCURRENT_FRAMES
   uniformBuffer : CONCURRENT_FRAMES
   uniformBufferMemory : CONCURRENT_FRAMES
3. descriptorSets : CONCURRENT_FRAMES
   uniformBuffer : CONCURRENT_FRAMES
   uniformBufferMemory : 1
4. descriptorSets : CONCURRENT_FRAMES
   uniformBuffer : 1
   uniformBufferMemory : 1
5. descriptorSets : 1
   uniformBuffer : 1 (Dynamic Uniform Buffer)
   uniformBufferMemory : 1
Let's analyse each of the above configs.
1. This config is useful when the uniform data does not change every frame. A uniform set representing the surface of an object (in a lighting shader) is a good candidate. The same config used for a transformation uniform set might not serve well, as the values would get overwritten by subsequent concurrent frames while an earlier frame is still reading them.
2. If the application has a lot of objects, this config is best avoided: every uniform buffer then needs CONCURRENT_FRAMES separate memory allocations, which eats into the allocation limit.
5. Config 5 looks the best, as a single descriptor set can be bound at different offsets. Maybe Part-3 will address this config.
The blog will focus on configs 3 and 4. So let's bring in some code.
Config 3:
descriptorSets : CONCURRENT_FRAMES
uniformBuffer : CONCURRENT_FRAMES
uniformBufferMemory : 1
The data size is rounded up to minUniformBufferOffsetAlignment before the buffer is created.
// Round dataSize up to the next multiple of minUniformBufferOffsetAlignment (a power of two).
if (info.gpu_props.limits.minUniformBufferOffsetAlignment) {
    dataSize = (dataSize + info.gpu_props.limits.minUniformBufferOffsetAlignment - 1) & ~(info.gpu_props.limits.minUniformBufferOffsetAlignment - 1);
}
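The bit trick above is easier to trust with a few numbers plugged in. Here is a minimal sketch of the same rounding as a standalone helper; the name alignUp and the 256-byte alignment in the comments are assumptions for illustration, since the real value comes from the device limits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round `size` up to the next multiple of `alignment`.
// Relies on `alignment` being a non-zero power of two, which Vulkan requires
// of minUniformBufferOffsetAlignment.
inline std::uint64_t alignUp(std::uint64_t size, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Adding (alignment - 1) pushes size past the next boundary unless it is
    // already on one; masking the low bits then snaps it back to the boundary.
    return (size + alignment - 1) & ~(alignment - 1);
}

// With an assumed alignment of 256 bytes:
//   alignUp(80, 256)  == 256   (an 80-byte struct occupies one full slot)
//   alignUp(256, 256) == 256   (already aligned sizes are unchanged)
//   alignUp(257, 256) == 512   (one byte over spills into a second slot)
```

The mask only works for power-of-two alignments, which is why the helper asserts it instead of silently returning a wrong size.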
The code snippet starts off by creating CONCURRENT_FRAMES buffers, each with the same data size. Since all of them have the same data size, any one of them can be used to query the memory requirements for the buffer. The memory must be large enough to accommodate CONCURRENT_FRAMES buffers, hence the allocation size is memReq.size*CONCURRENT_FRAMES. The rest of the allocation code involves finding the right type of memory for the buffer. Among many references, Intel's Vulkan tutorials can be used to figure out the heap type search. Another good source is Qualcomm's video.
Until now the buffers and the memory were created, but they lived in their own worlds. In order to use a buffer, it needs to be linked to the allocated memory. This is handled by the bindBufferMemory function. The offset (3rd param) is the interesting one: it needs to follow the rules of alignment. As the Vulkan spec states:
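For readers who don't want to dig through the references, the heap type search boils down to a small loop. This is only a sketch under assumptions: MemoryType, findMemoryType and the flag constants are stand-ins so that it compiles without the Vulkan SDK. In real code the list comes from VkPhysicalDeviceMemoryProperties and typeBits from VkMemoryRequirements::memoryTypeBits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in flag values, mirroring VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT and
// VK_MEMORY_PROPERTY_HOST_COHERENT_BIT.
const std::uint32_t kHostVisible  = 0x2;
const std::uint32_t kHostCoherent = 0x4;

// Stand-in for one entry of VkPhysicalDeviceMemoryProperties::memoryTypes.
struct MemoryType {
    std::uint32_t propertyFlags;
};

// Returns the index of the first memory type that the buffer is allowed to use
// (bit i of typeBits is set) and that has every required property flag.
// Returns -1 when no type qualifies.
inline int findMemoryType(const std::vector<MemoryType>& types,
                          std::uint32_t typeBits,
                          std::uint32_t requiredFlags) {
    assert(types.size() <= 32); // Vulkan exposes at most 32 memory types
    for (std::uint32_t i = 0; i < types.size(); ++i) {
        const bool allowed  = (typeBits & (1u << i)) != 0;
        const bool suitable = (types[i].propertyFlags & requiredFlags) == requiredFlags;
        if (allowed && suitable) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For a uniform buffer that the CPU writes every frame, requiredFlags would typically be kHostVisible | kHostCoherent, and the returned index goes into the allocation info next to the memReq.size*CONCURRENT_FRAMES size.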
memoryOffset must be an integer multiple of the alignment member of the VkMemoryRequirements structure returned from a call to vkGetBufferMemoryRequirements with buffer
If this constraint is ignored, the validation layers will complain about the offset. In our case the memory requirement size was used as the distance between buffers, and we didn't face any such issue, because the size returned for the buffer already took the alignment into account. Rounding memReq.size up to memReq.alignment before using it as a stride is a cheap safeguard in case an implementation does not do this.
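To make the offset rule concrete, here is a small sketch that computes where each per-frame buffer would be bound inside the single allocation. MemReq and bindOffsets are hypothetical names; MemReq only mirrors the size and alignment members of VkMemoryRequirements so the example runs without Vulkan.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the two VkMemoryRequirements members used here.
struct MemReq {
    std::uint64_t size;
    std::uint64_t alignment; // assumed to be a power of two
};

// Offsets at which frameCount identical buffers can be bound in one allocation.
inline std::vector<std::uint64_t> bindOffsets(const MemReq& req, std::uint32_t frameCount) {
    assert(req.alignment != 0 && (req.alignment & (req.alignment - 1)) == 0);
    // Round the stride up defensively. When size is already a multiple of
    // alignment, the stride is simply req.size.
    const std::uint64_t stride = (req.size + req.alignment - 1) & ~(req.alignment - 1);
    std::vector<std::uint64_t> offsets;
    offsets.reserve(frameCount);
    for (std::uint32_t i = 0; i < frameCount; ++i) {
        offsets.push_back(stride * i); // always a multiple of req.alignment
    }
    return offsets;
}
```

Each offset is what would be passed as the third argument of bindBufferMemory for the matching buffer. If the stride ends up larger than memReq.size, the allocation size has to be stride*CONCURRENT_FRAMES as well.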
The descriptorBufferInfo contains an offset and a range. As each buffer represents a single uniform struct, the offset will be 0 and the range will be dataSize ( = sizeof(UniformStruct)). Now our descriptor sets are linked to the buffers.
The last step is copying data into the buffer memory, to make it accessible to shaders. The offsets will be:
memReq.size * 0 for first frame
memReq.size * 1 for second frame and so on.
The range will be dataSize. Memory mapping is a heavy operation, but done wisely it need not affect the frame rate. When the memory is shared, copying data to different offsets of the same memory needs mapping/unmapping just once.
The overall picture looks something like the above. The blue boxes are alignment padding, whose size varies with the data size.
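The copy step can be sketched without Vulkan by treating the mapped pointer as a plain byte array. UniformStruct and writeFrameUniforms are hypothetical names; in the real application `mapped` would be the pointer obtained from a single map call on the shared memory, and `stride` the per-frame distance used when binding the buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical uniform struct, for illustration only.
struct UniformStruct {
    float mvp[16];
};

// Copies one frame's uniforms into that frame's slice of the mapped allocation.
// The memory is mapped once up front; per frame only the memcpy happens.
inline void writeFrameUniforms(unsigned char* mapped, std::size_t stride,
                               std::uint32_t frameIndex, const UniformStruct& data) {
    assert(sizeof(UniformStruct) <= stride); // the struct must fit in its slice
    std::memcpy(mapped + stride * frameIndex, &data, sizeof(UniformStruct));
}
```

One caveat worth keeping in mind: if the chosen memory type is not host coherent, the written range still has to be flushed before the GPU reads it, even when the memory stays mapped.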
Config 4 analysis in Part-2.