@@ -44,23 +44,31 @@ The difference is that the state of fences can be accessed from your program
4444using calls like ` vkWaitForFences ` and semaphores cannot be. Fences are mainly
4545designed to synchronize your application itself with rendering operations,
4646whereas semaphores are used to synchronize operations within or across command
47- queues. We want to synchronize the queue operations of draw commands and
48- presentation, which makes semaphores the best fit .
47+ queues. We will use a fence to synchronize swap chain image acquisition and a
48+ semaphore to synchronize drawing commands with presentation .
4949
50- ## Semaphores
50+ ## Fence
5151
52- We'll need one semaphore to signal that an image has been acquired and is ready
53- for rendering, and another one to signal that rendering has finished and
54- presentation can happen. Create two class members to store these semaphore
55- objects:
52+ A fence is a synchronization primitive in Vulkan that allows you to synchronize
53+ your program with GPU operations. You can set up GPU operations like drawing
54+ commands to put such a fence in a *signalled* state when they are completed and
55+ use a function call like ` vkWaitForFences ` in your own program to wait for them
56+ to become signalled. This is useful because many functions in Vulkan that start
57+ operations return immediately and the actual operation is finished at some point
58+ in the background. With fences you can wait for one or more of these operations
59+ to finish before continuing code execution.
60+
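The signal, wait and reset life cycle described above can be sketched with a CPU-only analogy. This is not Vulkan code: `HostFence` and `runBackgroundJobAndWait` are hypothetical names made up for illustration, and a real `VkFence` is signalled by the GPU or presentation engine rather than by a thread. The sketch only shows the idea that a call returns immediately while the work finishes in the background, and that the program can block until the fence is signalled.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical CPU-only stand-in for a fence; not a Vulkan type.
class HostFence {
public:
    // Called by the background operation when it completes.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signalled_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until the fence is signalled or the timeout expires.
    // Returns true if the fence was signalled in time.
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signalled_; });
    }

    // Puts the fence back into the unsignalled state so it can be reused.
    void reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        signalled_ = false;
    }

    bool isSignalled() {
        std::lock_guard<std::mutex> lock(mutex_);
        return signalled_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signalled_ = false;
};

// Starts a short background job that signals the fence when done,
// then waits on the fence from the calling thread.
bool runBackgroundJobAndWait(HostFence& fence) {
    std::thread worker([&fence] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        fence.signal();
    });
    bool finished = fence.wait(std::chrono::seconds(5));
    worker.join();
    return finished;
}
```
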
61+ We'll be using a fence to signal that an image has been acquired from the swap
62+ chain and is ready for rendering. Create a class member to store this fence
63+ object:
5664
5765``` c++
58- VkSemaphore imageAvailableSemaphore;
59- VkSemaphore renderFinishedSemaphore;
66+ VkFence imageAvailableFence;
6067```
6168
62- To create the semaphores, we'll add the last ` create ` function for this part of
63- the tutorial: ` createSemaphores ` :
69+ To create the fence, we'll add the last ` create ` function for this part of the
70+ tutorial: ` createSynchronizationPrimitives ` . It's named that way because we'll
71+ also create the semaphore in this function later on.
6472
6573``` c++
6674void initVulkan () {
@@ -76,46 +84,92 @@ void initVulkan() {
7684 createFramebuffers();
7785 createCommandPool();
7886 createCommandBuffers();
79- createSemaphores ();
87+ createSynchronizationPrimitives ();
8088}
8189
8290...
8391
84- void createSemaphores () {
92+ void createSynchronizationPrimitives () {
93+
94+ }
95+ ```
96+
97+ Creating a fence requires filling in the ` VkFenceCreateInfo ` struct, but in the
98+ current version of the API we'll only need to fill in the ` sType ` field. There
99+ is an optional ` flags ` field that allows you to initialize a fence to already be
100+ signalled, but we don't need that.
101+
102+ ``` c++
103+ void createSynchronizationPrimitives () {
104+ VkFenceCreateInfo fenceInfo = {};
105+ fenceInfo.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO;
106+ }
107+ ```
108+
109+ Creating the fence follows the familiar pattern with ` vkCreateFence ` :
85110
111+ ``` c++
112+ if (vkCreateFence(device, &fenceInfo, nullptr, &imageAvailableFence) != VK_SUCCESS) {
113+ throw std::runtime_error("failed to create fence!");
86114}
87115```
88116
89- Creating semaphores requires filling in the ` VkSemaphoreCreateInfo ` , but in the
90- current version of the API it doesn't actually have any required fields besides
91- ` sType ` :
117+ The fence should be cleaned up at the end of the program, when all commands have
118+ finished and no more synchronization is necessary:
119+
120+ ``` c++
121+ void cleanup () {
122+ vkDestroyFence (device, imageAvailableFence, nullptr);
123+ ```
124+
125+ ## Semaphore
126+
127+ We will also need to synchronize the completion of drawing operations with the
128+ operation to present an image to the screen. We could accomplish this with
129+ fences as well, but for this operation we'll look at another synchronization
130+ primitive: *semaphores*. Fences are required for synchronization between the
131+ code on the CPU and operations on the GPU, but semaphores are a more efficient
132+ way to synchronize only GPU operations.
133+
134+ We'll need a semaphore to signal that rendering has finished and presentation
135+ can happen. Add a class member to store this semaphore object:
136+
137+ ```c++
138+ VkFence imageAvailableFence;
139+ VkSemaphore renderFinishedSemaphore;
140+ ```
141+
142+ We'll continue working in the ` createSynchronizationPrimitives ` function.
143+ Creating semaphores requires filling in the ` VkSemaphoreCreateInfo ` struct, but
144+ in the current version of the API it doesn't actually have any required fields
145+ besides ` sType ` :
92146
93147``` c++
94- void createSemaphores () {
148+ void createSynchronizationPrimitives () {
149+ ...
150+
95151 VkSemaphoreCreateInfo semaphoreInfo = {};
96152 semaphoreInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO;
97153}
98154```
99155
100156Future versions of the Vulkan API or extensions may add functionality for the
101157` flags ` and ` pNext ` parameters like it does for the other structures. Creating
102- the semaphores follows the familiar pattern with ` vkCreateSemaphore ` :
158+ a semaphore is done through ` vkCreateSemaphore ` :
103159
104160``` c++
105- if (vkCreateSemaphore(device, &semaphoreInfo, nullptr , &imageAvailableSemaphore) != VK_SUCCESS ||
106- vkCreateSemaphore (device, &semaphoreInfo, nullptr, &renderFinishedSemaphore) != VK_SUCCESS) {
107-
108- throw std::runtime_error("failed to create semaphores!");
161+ if (vkCreateSemaphore(device, &semaphoreInfo, nullptr, &renderFinishedSemaphore) != VK_SUCCESS) {
162+ throw std::runtime_error("failed to create semaphore!");
109163}
110164```
111165
112- The semaphores should be cleaned up at the end of the program, when all commands
166+ The semaphore should be cleaned up at the end of the program, when all commands
113167have finished and no more synchronization is necessary:
114168
115169``` c++
116170void cleanup () {
117171 vkDestroySemaphore (device, renderFinishedSemaphore, nullptr);
118- vkDestroySemaphore (device, imageAvailableSemaphore , nullptr);
172+ vkDestroyFence (device, imageAvailableFence , nullptr);
119173```
120174
121175## Acquiring an image from the swap chain
@@ -128,7 +182,7 @@ convention:
128182```c++
129183void drawFrame() {
130184 uint32_t imageIndex;
131- vkAcquireNextImageKHR (device, swapChain, std::numeric_limits<uint64_t>::max(), imageAvailableSemaphore, VK_NULL_HANDLE , &imageIndex);
185+ vkAcquireNextImageKHR(device, swapChain, std::numeric_limits<uint64_t>::max(), VK_NULL_HANDLE, imageAvailableFence , &imageIndex);
132186}
133187```
134188
@@ -140,14 +194,34 @@ maximum value of a 64 bit unsigned integer disables the timeout.
140194The next two parameters specify synchronization objects that are to be signaled
141195when the presentation engine is finished using the image. That's the point in
142196time where we can start drawing to it. It is possible to specify a semaphore,
143- fence or both. We're going to use our ` imageAvailableSemaphore ` for that purpose
197+ fence or both. We're going to use our ` imageAvailableFence ` for that purpose
144198here.
145199
146200The last parameter specifies a variable to output the index of the swap chain
147201image that has become available. The index refers to the ` VkImage ` in our
148202` swapChainImages ` array. We're going to use that index to pick the right command
149203buffer.
150204
205+ This function returns immediately, possibly before an image is actually
206+ available. To wait for this to happen, we should now wait on our fence to be
207+ signalled using ` vkWaitForFences ` :
208+
209+ ``` c++
210+ vkWaitForFences (device, 1, &imageAvailableFence, VK_TRUE, std::numeric_limits<uint64_t>::max());
211+ ```
212+
213+ This function takes an array of fences to wait on. The fourth parameter can be
214+ set to `VK_TRUE` to indicate that all fences should be signalled or `VK_FALSE`
215+ if only one is sufficient. In our case it doesn't make a difference. The last
216+ parameter is a timeout parameter, just like the one in `vkAcquireNextImageKHR`.
217+
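The meaning of the fourth parameter can be illustrated without any Vulkan calls. The helper below is a hypothetical sketch, not part of the API: it takes the current state of each fence as a plain `bool` and reports whether a wait would be satisfied, using "all" semantics for `VK_TRUE` and "any" semantics for `VK_FALSE`. It does no actual waiting.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: states[i] is true when fence i is signalled.
// Mirrors only the meaning of the waitAll flag; it never blocks.
bool waitConditionMet(const std::vector<bool>& states, bool waitAll) {
    auto isSignalled = [](bool state) { return state; };
    if (waitAll) {
        // Corresponds to VK_TRUE: every fence must be signalled.
        return std::all_of(states.begin(), states.end(), isSignalled);
    }
    // Corresponds to VK_FALSE: one signalled fence is enough.
    return std::any_of(states.begin(), states.end(), isSignalled);
}
```

With a single fence, as in this chapter, both modes give the same answer, which is why the choice doesn't matter here.
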
218+ The `vkWaitForFences` function doesn't automatically reset the fence, so it is
219+ still signalled at this point. We should reset it to be used for the next frame:
220+
221+ ```c++
222+ vkResetFences(device, 1, &imageAvailableFence);
223+ ```
224+
151225## Submitting the command buffer
152226
153227Queue submission and synchronization is configured through parameters in the
@@ -157,20 +231,15 @@ Queue submission and synchronization is configured through parameters in the
157231VkSubmitInfo submitInfo = {};
158232submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
159233
160- VkSemaphore waitSemaphores[] = {imageAvailableSemaphore};
161- VkPipelineStageFlags waitStages[ ] = {VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};
162- submitInfo.waitSemaphoreCount = 1;
163- submitInfo.pWaitSemaphores = waitSemaphores;
164- submitInfo.pWaitDstStageMask = waitStages;
234+ submitInfo.waitSemaphoreCount = 0; // Optional
235+ submitInfo.pWaitSemaphores = nullptr; // Optional
236+ submitInfo.pWaitDstStageMask = nullptr; // Optional
165237```
166238
167- The first three parameters specify which semaphores to wait on before execution
168- begins and in which stage(s) of the pipeline to wait. We want to wait with
169- writing colors to the image until it's available, so we're specifying the stage
170- of the graphics pipeline that writes to the color attachment. That means that
171- theoretically the implementation can already start executing our vertex shader
172- and such while the image is not available yet. Each entry in the `waitStages`
173- array corresponds to the semaphore with the same index in `pWaitSemaphores`.
239+ The rendering operations must not write to the image until it has been
240+ acquired successfully. We've already synchronized this explicitly by
241+ waiting for the image acquisition fence to be signalled before submitting the
242+ drawing command buffer in the first place, so no semaphore is necessary here.
174243
175244``` c++
176245submitInfo.commandBufferCount = 1 ;
@@ -199,10 +268,9 @@ if (vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE) != VK_SUCCESS)
199268
200269We can now submit the command buffer to the graphics queue using
201270` vkQueueSubmit ` . The function takes an array of ` VkSubmitInfo ` structures as
202- argument for efficiency when the workload is much larger. The last parameter
203- references an optional fence that will be signaled when the command buffers
204- finish execution. We're using semaphores for synchronization, so we'll just pass
205- a ` VK_NULL_HANDLE ` .
271+ argument for efficiency when the workload is much larger. The last parameter can
272+ be used to signal a fence once the drawing commands have finished execution, but
273+ we're using a semaphore in this case so we'll just pass a ` VK_NULL_HANDLE ` here.
206274
207275## Subpass dependencies
208276
@@ -324,7 +392,7 @@ from `debugCallback` tell us why:
324392
325393
326394
327- Remember that all of the operations in `drawFrame` are asynchronous. That means
395+ Remember that many of the operations in `drawFrame` are asynchronous. That means
328396that when we exit the loop in `mainLoop`, drawing and presentation operations
329397may still be going on. Cleaning up resources while that is happening is a bad
330398idea.
@@ -348,9 +416,43 @@ You can also wait for operations in a specific command queue to be finished with
348416perform synchronization. You'll see that the program now exits without problems
349417when closing the window.
350418
419+ ## Fence versus semaphore
420+
421+ We have now used a fence to synchronize image acquisition with drawing commands and a
422+ semaphore to synchronize drawing with presentation. As you saw, it is also
423+ possible for ` vkAcquireNextImageKHR ` to signal a semaphore instead of a fence
424+ and have ` vkQueueSubmit ` wait on a semaphore through fields in ` VkSubmitInfo ` .
425+
426+ Doing this synchronization with a semaphore would look like this:
427+
428+ ``` c++
429+ vkAcquireNextImageKHR (device, swapChain, std::numeric_limits<uint64_t>::max(), imageAvailableSemaphore, VK_NULL_HANDLE, &imageIndex);
430+
431+ ...
432+
433+ VkSemaphore waitSemaphores[ ] = {imageAvailableSemaphore};
434+ VkPipelineStageFlags waitStages[ ] = {VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};
435+ submitInfo.waitSemaphoreCount = 1;
436+ submitInfo.pWaitSemaphores = waitSemaphores;
437+ submitInfo.pWaitDstStageMask = waitStages;
438+ ```
439+
440+ The rendering operations would then wait with writing colors to the image until
441+ it's available. That means that theoretically the implementation can already
442+ start executing our vertex shader and such while the image is not available yet.
443+ Each entry in the `waitStages` array corresponds to the semaphore with the same
444+ index in `pWaitSemaphores`.
445+
446+ Semaphores offer more fine-grained control over synchronization than fences,
447+ because a wait can be limited to specific pipeline stages, so they are usually
448+ the better choice for synchronizing GPU operations with each other. However, if
449+ you use no explicit synchronization with the CPU code at all, through either fences or
450+ commands like `vkQueueWaitIdle`, then it is harder for the validation layers to
451+ function properly, and this may lead to memory leaks. That's why I've opted to
452+
351453## Conclusion
352454
353- About 800 lines of code later, we've finally gotten to the stage of seeing
455+ About 900 lines of code later, we've finally gotten to the stage of seeing
354456something pop up on the screen! Bootstrapping a Vulkan program is definitely a
355457lot of work, but the take-away message is that Vulkan gives you an immense
356458amount of control through its explicitness. I recommend you to take some time