Conversation

eplankin
Contributor

@eplankin eplankin commented Mar 25, 2022

Fixed issue #11303.
The problem was as follows: in the parallel version of the function, when kernelSize/2 is larger than the height of a tile, an out-of-bounds read occurs while the second and the second-to-last tiles are processed.
Sequential mode is now used in such cases.
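
For readers following along, the condition described above can be sketched as a small guard. This is a minimal illustration, not the code from the patch: the helper name `useSequentialGaussian`, its parameters, and the assumption that the image is split into horizontal stripes of roughly `rows / numStripes` rows each are all hypothetical.

```cpp
#include <cassert>

// Hypothetical helper (not the actual OpenCV code): decides whether the blur
// should run as one sequential pass instead of being split into stripes.
// Assumption: the image is split into `numStripes` horizontal stripes of
// roughly equal height, so the shortest stripe has rows / numStripes rows.
static bool useSequentialGaussian(int rows, int numStripes, int kernelSize)
{
    assert(rows > 0 && numStripes > 0 && kernelSize > 0);

    if (numStripes <= 1)
        return true;  // a single stripe is already sequential

    const int radius = kernelSize / 2;           // rows needed above and below each pixel
    const int stripeHeight = rows / numStripes;  // integer division: shortest stripe

    // Fall back to sequential mode when the kernel radius exceeds the stripe
    // height, i.e. the case in which the out-of-bounds read was reported.
    return radius > stripeHeight;
}
```

With a 100-row image, 4 stripes and a 101-tap kernel, the radius (50) exceeds the stripe height (25), so this sketch would choose the sequential path; with a 5-tap kernel on a 1000-row image it would keep the parallel path.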

@asmorkalov asmorkalov requested a review from alalek March 25, 2022 10:31
@eplankin eplankin marked this pull request as draft March 25, 2022 10:58
@alalek
Member

alalek commented Mar 25, 2022

As a "bugfix", this patch should go into the 3.4 branch first.
We merge changes from 3.4 into 4.x regularly (weekly or bi-weekly).

Please:

  • change the "base" branch of this PR: 4.x => 3.4 (use the "Edit" button near the PR title)
  • rebase your commits from 4.x onto the 3.4 branch. For example:
    git rebase -i --onto upstream/3.4 upstream/4.x
    (check the list of your commits, then save and quit: Esc + ":wq" + Enter)
    where upstream is a remote configured by following this GitHub guide and then fetched (git fetch upstream).
  • push the rebased commits into the source branch of your fork (with the --force option)

Note: there is no need to re-open the PR; apply the changes in place.

@eplankin eplankin changed the base branch from 4.x to 3.4 March 25, 2022 15:32
@alalek
Member

alalek commented Mar 29, 2022

The problem is confirmed (with 4+ threads):

$ valgrind ./bin/opencv_test_imgproc --gtest_filter=*regression_11303 --test_threads=4
...
[ RUN      ] Imgproc_GaussianBlur.regression_11303
==697273== Thread 3:
==697273== Invalid read of size 32
==697273==    at 0x592CA8D: icv_l9_ownFilterGaussianRow_Brd_32f_C1 (in /home/alalek/projects/opencv/build/opencv/lib/libopencv_imgproc.so.3.4.17)
==697273==    by 0x59392FD: icv_l9_ippiFilterGaussian_32f_C1R_L (in /home/alalek/projects/opencv/build/opencv/lib/libopencv_imgproc.so.3.4.17)
==697273==    by 0x504436F: llwiFilterGaussian_Process (iw_image_filter_gaussian.c:385)
==697273==    by 0x50441AF: llwiFilterGaussian_ProcessWrap (iw_image_filter_gaussian.c:350)
==697273==    by 0x50437CF: llwiFilterGaussian_ProcessWrap (iw_image_filter_gaussian.c:241)
==697273==    by 0x50433F0: iwiFilterGaussian (iw_image_filter_gaussian.c:113)
==697273==    by 0x4D42A87: ipp::iwiFilterGaussian(ipp::IwiImage const&, ipp::IwiImage&, int, double, ipp::IwiFilterGaussianParams const&, ipp::IwiBorderType const&, ipp::IwiTile const&) (iw_image_filter.hpp:281)
==697273==    by 0x4D58E60: cv::ipp_gaussianBlurParallel::operator()(cv::Range const&) const (smooth.dispatch.cpp:507)
==697273==    by 0x72042D4: cv::(anonymous namespace)::ParallelLoopBodyWrapper::operator()(cv::Range const&) const (parallel.cpp:340)
==697273==    by 0x7206310: cv::ParallelJob::execute(bool) (parallel_impl.cpp:344)
==697273==    by 0x7206711: cv::WorkerThread::thread_body() (parallel_impl.cpp:468)
==697273==    by 0x7205F5D: cv::WorkerThread::thread_loop_wrapper(void*) (parallel_impl.cpp:284)
==697273==  Address 0x1265d864 is 1,785,108 bytes inside a block of size 1,785,132 alloc'd
==697273==    at 0x484486F: malloc (vg_replace_malloc.c:381)
==697273==    by 0x6F85D23: cv::fastMalloc(unsigned long) (alloc.cpp:150)
==697273==    by 0x712EFDE: cv::StdMatAllocator::allocate(int, int const*, int, void*, unsigned long*, int, cv::UMatUsageFlags) const (matrix.cpp:147)
==697273==    by 0x71317B3: cv::Mat::create(int, int const*, int) (matrix.cpp:645)
==697273==    by 0x71310B5: cv::Mat::create(int, int, int) (matrix.cpp:537)
==697273==    by 0x71300FB: cv::Mat::Mat(cv::Size_<int>, int, cv::Scalar_<double> const&) (matrix.cpp:371)
==697273==    by 0x64EC88: opencv_test::(anonymous namespace)::Imgproc_GaussianBlur_regression_11303_Test::Body() (test_filter.cpp:2364)
==697273==    by 0x64EB59: opencv_test::(anonymous namespace)::Imgproc_GaussianBlur_regression_11303_Test::TestBody() (test_filter.cpp:2358)
==697273==    by 0x76DBF0: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (ts_gtest.cpp:3917)
==697273==    by 0x767FCB: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (ts_gtest.cpp:3953)
==697273==    by 0x74C857: testing::Test::Run() (ts_gtest.cpp:3991)
==697273==    by 0x74D188: testing::TestInfo::Run() (ts_gtest.cpp:4167)
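
The numbers in the Valgrind report above show how far the read overruns the buffer: a 32-byte read starts 1,785,108 bytes into a 1,785,132-byte block, so only 24 bytes remain and the last 8 bytes fall past the end. A small sketch of that arithmetic follows; the helper name `bytesPastEnd` is hypothetical and is not part of OpenCV or Valgrind.

```cpp
#include <cstddef>

// Hypothetical helper: returns how many bytes of a read that starts at
// `offset` and spans `readSize` bytes fall past the end of a block of
// `blockSize` bytes (0 if the whole read fits inside the block).
constexpr std::size_t bytesPastEnd(std::size_t offset,
                                   std::size_t readSize,
                                   std::size_t blockSize)
{
    return (offset + readSize > blockSize) ? (offset + readSize - blockSize) : 0;
}

// Values taken from the report: offset 1,785,108, read size 32, block size 1,785,132.
static_assert(bytesPastEnd(1785108, 32, 1785132) == 8,
              "the 32-byte read overruns the block by 8 bytes");
```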


 ocv_check_environment_variables(OPENCV_IPP_GAUSSIAN_BLUR)
-option(OPENCV_IPP_GAUSSIAN_BLUR "Enable IPP optimizations for GaussianBlur (+8Mb in binary size)" OFF)
+option(OPENCV_IPP_GAUSSIAN_BLUR "Enable IPP optimizations for GaussianBlur (+8Mb in binary size)" ON)
Member

Please revert this change before merging.

@eplankin
Contributor Author

eplankin commented Apr 4, 2022

@alalek, I cannot reproduce the invalid read with 4+ threads locally using the same ICV package. The Valgrind log is below. Could you please share the full Valgrind log and any other details?

==17091== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==17091== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==17091== Command: ./bin/opencv_test_imgproc --gtest_filter=*Imgproc_GaussianBlur*11303 --test-threads=5
==17091== 
CTEST_FULL_OUTPUT
OpenCV version: 3.4.17-dev
OpenCV VCS version: 3.4.17-116-g6a68ba1-dirty
Build type: Release
Compiler: /bin/c++  (ver 4.8.5)
Parallel framework: pthreads (nthreads=4)
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *FP16? *AVX *AVX2?
Intel(R) IPP version: ippIP SSE4.2 (y8) 2020.0.0 Gold (-) Oct 19 2019
Intel(R) IPP features code: 0x80
OpenCL is disabled
TEST: Skip tests with tags: 'mem_6gb', 'verylong'
Note: Google Test filter = *Imgproc_GaussianBlur*11303
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Imgproc_GaussianBlur
[ RUN      ] Imgproc_GaussianBlur.regression_11303
[       OK ] Imgproc_GaussianBlur.regression_11303 (620 ms)
[----------] 1 test from Imgproc_GaussianBlur (626 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (642 ms total)
[  PASSED  ] 1 test.
==17091== 
==17091== HEAP SUMMARY:
==17091==     in use at exit: 50,446 bytes in 253 blocks
==17091==   total heap usage: 541,863 allocs, 541,610 frees, 52,342,802 bytes allocated
==17091== 
==17091== LEAK SUMMARY:
==17091==    definitely lost: 0 bytes in 0 blocks
==17091==    indirectly lost: 0 bytes in 0 blocks
==17091==      possibly lost: 4,676 bytes in 83 blocks
==17091==    still reachable: 45,770 bytes in 170 blocks
==17091==         suppressed: 0 bytes in 0 blocks
==17091== Rerun with --leak-check=full to see details of leaked memory
==17091== 
==17091== For counts of detected and suppressed errors, rerun with: -v
==17091== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)

@alalek
Member

alalek commented Apr 4, 2022

> I cannot reproduce

The message above is just a confirmation of how to reproduce the problem, so that we can validate the proposed fix.


BTW,

> --test-threads=5
> (nthreads=4)

It looks like there are only 4 logical CPUs.


> out-of-memory

This is called an "out-of-bounds" (OOB) problem.


Please fix the build warning before merging.

@eplankin eplankin changed the title from "Fixed out-of-memory read in parallel version of ippGaussianBlur()" to "Fixed out-of-bounds read in parallel version of ippGaussianBlur()" Apr 5, 2022
@alalek
Member

alalek commented Apr 5, 2022

Please convert the PR from Draft if it is ready (use the "Ready for review" button).

Member

@alalek alalek left a comment

LGTM! Thank you 👍

@eplankin eplankin marked this pull request as ready for review April 5, 2022 12:47
@alalek alalek merged commit d793ec2 into opencv:3.4 Apr 5, 2022
@opencv-pushbot opencv-pushbot mentioned this pull request Apr 16, 2022
@opencv-pushbot opencv-pushbot mentioned this pull request Apr 23, 2022