@tailsu tailsu commented Jan 17, 2024

libjpeg-turbo has optimized built-in support for encoding/decoding from/to BGR and BGRA [1].

This PR uses that optimized support when it is available. As a result, encoding and decoding BGR are both ~2% faster, and encoding and decoding BGRA are both ~3% faster, measured on a Mac M1 (ARM64) with a 512x512 image. Grayscale performance is unchanged.
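
As a rough illustration of what native BGR support can save: when a codec only accepts RGB input, the caller has to reorder every pixel before handing rows over. The sketch below shows such a per-row swap in plain C++. `bgrRowToRgb` is a hypothetical helper written for this illustration; it is not taken from OpenCV or libjpeg-turbo, and whether the previous OpenCV code path performed exactly this conversion is an assumption here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not OpenCV or libjpeg-turbo code): copies one row of
// interleaved 8-bit BGR pixels into a new buffer in RGB order.
// Trailing bytes that do not form a complete pixel are left as zero.
std::vector<std::uint8_t> bgrRowToRgb(const std::vector<std::uint8_t>& bgr)
{
    std::vector<std::uint8_t> rgb(bgr.size());
    for (std::size_t i = 0; i + 2 < bgr.size(); i += 3) {
        rgb[i]     = bgr[i + 2]; // R comes from the third byte of the BGR pixel
        rgb[i + 1] = bgr[i + 1]; // G stays in the middle
        rgb[i + 2] = bgr[i];     // B comes from the first byte of the BGR pixel
    }
    return rgb;
}
```

When the codec consumes BGR or BGRA rows directly, as libjpeg-turbo is described as doing above, an extra pass like this over every scanline is no longer needed.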

Test code

TEST(Imgcodecs_Jpeg, encode_benchmark)
{
    cvtest::TS& ts = *cvtest::TS::ptr();
    std::string input = std::string(ts.get_data_path()) + "../cv/shared/lena.png";
    cv::Mat img = cv::imread(input);
    ASSERT_FALSE(img.empty());

    // uncomment to test BGRA branch
    // cv::cvtColor(img, img, cv::COLOR_BGR2BGRA);

    // Keep the fastest encode + decode round trip over 1000 runs.
    long long bestTimeUs = -1;
    for (int i = 0; i < 1000; ++i) {
        std::vector<uchar> output;
        // steady_clock is monotonic, so it is safe for measuring intervals.
        const auto start = std::chrono::steady_clock::now();
        cv::imencode(".jpg", img, output);
        cv::Mat roundtrip = cv::imdecode(output, cv::IMREAD_COLOR);
        const auto end = std::chrono::steady_clock::now();
        const long long timeUs = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
        if (bestTimeUs == -1 || timeUs < bestTimeUs) {
            bestTimeUs = timeUs;
        }

        // Save the first decoded image for a visual check of the round trip.
        if (i == 0) {
            cv::imwrite("roundtrip.jpg", roundtrip);
        }
    }
    std::cout << "Best time: " << bestTimeUs << " us" << std::endl;
}

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

asmorkalov commented Jan 23, 2024

Performance results for Intel i5-1135G7 (mobile), system instance of libjpeg (Ubuntu 20.04):

Geometric mean (ms)

Name of Test   4.x-1    patched-1   patched-1 vs 4.x-1 (x-factor)
Decode::JPEG   44.903   34.509      1.30
Encode::JPEG   39.511   31.819      1.24
decode::PNG    29.752   29.849      1.00
encode::PNG    55.645   55.357      1.01
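
The x-factor column in these tables is consistent with the baseline time divided by the patched time (for example, 44.903 / 34.509 ≈ 1.30), so values above 1 mean the patched build is faster. A minimal sketch of that arithmetic follows; `geometricMean` and `xFactor` are hypothetical names for this illustration, and exactly how `summary.py` aggregates individual samples is an assumption.

```cpp
#include <cmath>
#include <vector>

// Geometric mean of positive timings, computed through logarithms so that
// the running product cannot overflow. Precondition: samplesMs is non-empty.
double geometricMean(const std::vector<double>& samplesMs)
{
    double logSum = 0.0;
    for (double s : samplesMs) {
        logSum += std::log(s);
    }
    return std::exp(logSum / static_cast<double>(samplesMs.size()));
}

// Speed-up of the patched build relative to the baseline:
// greater than 1 means faster, less than 1 means slower.
double xFactor(double baselineMs, double patchedMs)
{
    return baselineMs / patchedMs;
}
```

For instance, `xFactor(39.511, 31.819)` gives roughly 1.24, matching the Encode::JPEG row above.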

asmorkalov commented Jan 23, 2024

Jetson TK1 (ARMv7), CPU_BASELINE="", no NEON, system instance of libjpeg (Ubuntu 16.04):

ubuntu@jetson1:~/Projects/perf-jpeg-turbo$ python3 ../opencv/modules/ts/misc/summary.py ./4.x-1.xml ./patched-1.xml 

Geometric mean (ms)

Name of Test   4.x-1     patched-1   patched-1 vs 4.x-1 (x-factor)
Decode::JPEG   153.790   130.682     1.18
Encode::JPEG   194.503   168.693     1.15
decode::PNG    82.813    82.154      1.01
encode::PNG    135.009   135.075     1.00

@asmorkalov

The same Intel system with self-built libjpeg from 3rdparty with NASM installed:

Geometric mean (ms)

Name of Test   4.x-1    patched-1   patched-1 vs 4.x-1 (x-factor)
Decode::JPEG   43.528   34.220      1.27
Encode::JPEG   34.849   28.169      1.24
decode::PNG    29.651   29.787      1.00
encode::PNG    55.511   55.781      1.00

@asmorkalov

The same Intel system with self-built libjpeg from 3rdparty without NASM:

Geometric mean (ms)

Name of Test   4.x-1     patched-1   patched-1 vs 4.x-1 (x-factor)
Decode::JPEG   79.568    71.328      1.12
Encode::JPEG   132.320   124.864     1.06
decode::PNG    29.715    29.695      1.00
encode::PNG    55.693    55.663      1.00

@asmorkalov

Jetson TK1 results with self-built libjpeg with NASM:

Geometric mean (ms)

Name of Test   4.x-1     patched-1   patched-1 vs 4.x-1 (x-factor)
Decode::JPEG   134.620   130.283     1.03
Encode::JPEG   136.431   116.768     1.17
decode::PNG    83.308    83.628      1.00
encode::PNG    137.066   136.243     1.01

@asmorkalov asmorkalov left a comment

👍 Thanks for the great job! The patch makes sense, even if NASM assembler is not found.

@asmorkalov asmorkalov merged commit 48ba45f into opencv:4.x Jan 23, 2024
@asmorkalov asmorkalov self-assigned this Jan 23, 2024
@tailsu tailsu deleted the sd/jpeg-turbo-color-extensions branch January 23, 2024 14:20
@asmorkalov asmorkalov mentioned this pull request Jan 23, 2024
@Kumataro Kumataro mentioned this pull request Mar 27, 2024