doc/tutorials/core/how_to_use_OpenCV_parallel_for_new/how_to_use_OpenCV_parallel_for_new.markdown
Lines changed: 1 addition & 27 deletions
@@ -96,36 +96,10 @@ When looking at the sequential implementation, we can notice that each pixel dep
@note Although values of a pixel in a particular stripe may depend on pixel values outside the stripe, these are only read only operations and hence will not cause undefined behaviour.
+C++ 11 standard allows a parallel implementation with a lambda expression:
-We first declare a custom class that inherits from @ref cv::ParallelLoopBody and override the `virtual void operator ()(const cv::Range& range) const`.
The range in the `operator ()` represents the subset of values that will be treated by an individual thread. Based on the requirement, there may be different ways of splitting the range which in turn changes the computation.
-
-For example, we can either
-1. Split the entire traversal of the image and obtain the [row, col] coordinate in the following way (as shown in the above code):
@note In our case, both implementations perform similarly. Some cases may allow better memory access patterns or other performance benefits.
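The removed text describes handing each thread a sub-range of a linear traversal and recovering the [row, col] coordinate from the linear index. A minimal standard-library sketch of that idea follows. The names `Coord`, `toCoord` and `splitRange` are hypothetical helpers introduced here for illustration; they are not OpenCV APIs, and the even ceiling-division split is an assumption, not a description of how @ref cv::parallel_for_ actually partitions its range.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCV type): a [row, col] pixel coordinate.
struct Coord
{
    int row;
    int col;
};

// Maps a linear index r over an image with `cols` columns to [row, col].
inline Coord toCoord(int r, int cols)
{
    assert(cols > 0);
    return { r / cols, r % cols };
}

// Hypothetical helper: cuts [0, total) into at most `nstripes` contiguous
// half-open sub-ranges [start, end). Each pair plays the role of the range
// that a single worker would receive.
// Assumption: stripes are sized by ceiling division; a real scheduler may differ.
inline std::vector<std::pair<int, int>> splitRange(int total, int nstripes)
{
    assert(total >= 0 && nstripes > 0);
    std::vector<std::pair<int, int>> stripes;
    const int chunk = (total + nstripes - 1) / nstripes; // ceiling division
    for (int start = 0; start < total; start += chunk)
        stripes.emplace_back(start, std::min(start + chunk, total));
    return stripes;
}
```

Under these assumptions, a worker that receives `[start, end)` would loop `r` from `start` to `end - 1` and call `toCoord(r, cols)` to find the pixel it should write. Because the stripes do not overlap, no two workers write to the same pixel.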
-
-To set the number of threads, you can use: @ref cv::setNumThreads. You can also specify the number of splitting using the nstripes parameter in @ref cv::parallel_for_. For instance, if your processor has 4 threads, setting `cv::setNumThreads(2)` or setting `nstripes=2` should be the same as by default it will use all the processor threads available but will split the workload only on two threads.
-
-@note C++ 11 standard allows to simplify the parallel implementation by get rid of the `parallelConvolution` class and replacing it with lambda expression: