Yolo Object Detection API #24691

Abdurrahheem · 2023-12-12T18:35:43Z

This PR introduces a unified API for the YOLO object detection family.

Note: This is a preliminary PR. It is not meant to be merged yet; it is proposed to get feedback.

Issues to fix:

Implement a sample for educational purposes
Add support for rescaling output boxes to the original image size
Add support for Darknet YOLO detectors (currently only ONNX models are supported)
Add support for image preprocessing for each YOLO model (not sure if we need this in the API explicitly)

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

LaurentBerger · 2023-12-12T19:58:25Z

I have already started one PR, #24267, for yolox

Abdurrahheem · 2023-12-12T21:03:09Z

> I have already started one PR, #24267, for yolox

This is a unified API for all yolo detectors

LaurentBerger · 2023-12-13T07:23:10Z

> I have already started one PR, #24267, for yolox
>
> This is a unified test for all yolo detectors

Wrong, it's like my PR but for all yolo models
@fengyuentau

Abdurrahheem · 2023-12-13T07:56:13Z

> I have already started one PR, #24267, for yolox
>
> This is a unified test for all yolo detectors
>
> Wrong, it's like my PR but for all yolo models @fengyuentau

Sorry, I meant an API for yolo detectors

LaurentBerger · 2023-12-13T08:01:22Z

> I have already started one PR, #24267, for yolox
>
> This is a unified test for all yolo detectors
>
> Wrong, it's like my PR but for all yolo models @fengyuentau
>
> Sorry, I meant an API for yolo detectors

I made this PR because of this

opencv-alalek · 2023-12-13T14:21:24Z

modules/dnn/include/opencv2/dnn/dnn.hpp

+         CV_DEPRECATED_EXTERNAL  // avoid using in C++ code (need to fix bindings first)
+         YoloObjectDetector();
+
+         CV_WRAP void detect(InputArray frame, CV_OUT std::vector<int>& classIds,


We should not have this here.
Users call DetectionModel::detect() from the base class instead. Inheritance is designed for that.

The box rectangle type is different from the one in DetectionModel's detect method.

How should users and bindings work with that?
Inheritance is not just a word.
Why doesn't DetectionModel need that method overload?

I have just checked the bindings. A call to the detect method in Python returns floating-point boxes, as expected.
So I do not understand what the issue is.
I have overridden the detect method because I changed the box type from Rect to Rect2d.
Do you want me to keep it as in the ancestor class? If yes, why?

Is there any reason for object detection to use Rect2d? All subsequent actions with a detected object require conversion to Rect2i anyway (ROI subtraction, drawing).

In that regard you are right, there is no point. I was coming from the point of view that OpenCV should return whatever the base model outputs. Do you suggest changing it to Rect?

We apply some postprocessing anyway, so the output is not the original one. So yes, an integer type is enough for such a high-level API, and I cannot imagine a case where float/double precision is required.

Why do we still have this here?

User uses DetectionModel::detect() from the base class instead.

I overload detect in the Impl class. Without a definition in the base class it fails to build. You can check on your side.
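The Rect versus Rect2d exchange in this thread comes down to where floating-point boxes get rounded to pixels. A minimal sketch of one way to do that conversion is below. It uses hypothetical plain structs (`BoxF`, `BoxI`) and a hypothetical helper `toPixelBox` rather than `cv::Rect2d` and `cv::Rect`, so it compiles without OpenCV; it is not code from the PR.

```cpp
#include <cmath>

// Hypothetical stand-ins for cv::Rect2d and cv::Rect (not OpenCV types).
struct BoxF { double x, y, width, height; };
struct BoxI { int x, y, width, height; };

// Round a floating-point box to integer pixels by rounding both corners.
// Width and height are derived from the rounded corners, so the right and
// bottom edges are each rounded only once.
inline BoxI toPixelBox(const BoxF& b)
{
    const int x0 = static_cast<int>(std::lround(b.x));
    const int y0 = static_cast<int>(std::lround(b.y));
    const int x1 = static_cast<int>(std::lround(b.x + b.width));
    const int y1 = static_cast<int>(std::lround(b.y + b.height));
    return BoxI{x0, y0, x1 - x0, y1 - y0};
}
```

Doing this once inside the high-level API would match the reviewers' point that callers need integer rectangles for drawing and ROI handling anyway.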

opencv-alalek · 2023-12-13T14:22:31Z

modules/dnn/include/opencv2/dnn/dnn.hpp

+                             CV_OUT std::vector<float>& confidences, CV_OUT std::vector<Rect2d>& boxes,
+                             float confThreshold = 0.5f, float nmsThreshold = 0.0f);
+
+         CV_WRAP void postProccess(


It is better to move it outside of this class with appropriate names. Like NMSBoxes()

Made it static
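For context on the suggestion to expose post-processing as a free function in the style of NMSBoxes(), here is a minimal sketch of greedy non-maximum suppression written as a standalone function. The names `Box`, `iou` and `nmsIndices` are hypothetical and the code is not the OpenCV implementation; it only illustrates why this step does not need any class state.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Box { float x, y, w, h; };  // hypothetical stand-in, not cv::Rect

// Intersection over union of two axis-aligned boxes.
inline float iou(const Box& a, const Box& b)
{
    const float x0 = std::max(a.x, b.x);
    const float y0 = std::max(a.y, b.y);
    const float x1 = std::min(a.x + a.w, b.x + b.w);
    const float y1 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x1 - x0) * std::max(0.0f, y1 - y0);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: returns the indices of the kept boxes, highest score first.
inline std::vector<int> nmsIndices(const std::vector<Box>& boxes,
                                   const std::vector<float>& scores,
                                   float scoreThreshold, float nmsThreshold)
{
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> keep;
    for (int idx : order) {
        if (scores[idx] < scoreThreshold)
            break;  // scores are sorted, so all remaining ones are lower
        bool suppressed = false;
        for (int k : keep) {
            if (iou(boxes[idx], boxes[k]) > nmsThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
            keep.push_back(idx);
    }
    return keep;
}
```

Because the function takes only boxes, scores and thresholds, it can live outside the detector class, which is the design the review comment asks for.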

Abdurrahheem · 2023-12-14T09:07:58Z

@dkurt @akarsakov @opencv-alalek, here are some points on which I need your opinion:

Do we need image preprocessing as a part of the API?
Do we need image rescaling as a part of the API?

dkurt · 2023-12-15T12:15:43Z

modules/dnn/src/model.cpp

+    ImagePaddingMode paddingMode;
+    YoloVersion yoloVersion;
+    float padValue;
+    float Height, Width;


Suggested change

float Height, Width;

Size frameSize;

When I use Size instead of floats, the model predictions start drifting for some unknown reason.

Probably because of integer division. Please at least rename them to lower-case height and width.
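The integer-division guess can be illustrated with a small sketch. The PR's actual scaling code is not shown in this thread, so the helpers below (`scaleInt`, `scaleFloat`) are hypothetical; they only show how a scale factor computed from two integer dimensions loses its fractional part unless one operand is converted first.

```cpp
// Hypothetical helpers; all sizes are in pixels.

// Both operands are int, so the quotient is truncated before the cast.
inline float scaleInt(int frameWidth, int inputWidth)
{
    return static_cast<float>(frameWidth / inputWidth);
}

// Converting before dividing keeps the fractional part of the ratio.
inline float scaleFloat(int frameWidth, int inputWidth)
{
    return static_cast<float>(frameWidth) / static_cast<float>(inputWidth);
}
```

With a 1000-pixel frame and a 640-pixel network input, the first helper yields 1 while the second yields 1.5625, which would shift every rescaled box. Storing a Size is still possible as long as the division itself is done in floating point.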

asmorkalov · 2023-12-15T15:10:38Z

 modules/dnn/include/opencv2/dnn/dnn.hpp:1655: trailing whitespace.
+     {

dkurt · 2023-12-19T15:39:37Z

modules/dnn/src/model.cpp

+        padValue = padValue_;
+    }
+
+    void boxRescale(CV_OUT std::vector<Rect>& boxes){


Can you use blobRectToImageRect by @LaurentBerger?

Can be done in a separate PR
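As background for what a box-rescaling helper has to do, here is a minimal sketch of mapping a box from letterboxed network-input coordinates back to the original image. It assumes a single uniform scale and equal padding on both sides, which is an assumption for illustration and not taken from the PR. `BoxI` and `letterboxToImage` are hypothetical names; this is not the blobRectToImageRect implementation.

```cpp
#include <algorithm>
#include <cmath>

struct BoxI { int x, y, width, height; };  // hypothetical stand-in for cv::Rect

// Map a box from letterboxed network-input coordinates to image coordinates.
// Assumes the image was resized with one uniform scale and centred, with
// equal padding on the left/right and top/bottom.
inline BoxI letterboxToImage(const BoxI& b, int imgW, int imgH, int netW, int netH)
{
    const float scale = std::min(static_cast<float>(netW) / imgW,
                                 static_cast<float>(netH) / imgH);
    const float padX = (netW - imgW * scale) / 2.0f;
    const float padY = (netH - imgH * scale) / 2.0f;

    BoxI out;
    out.x = static_cast<int>(std::lround((b.x - padX) / scale));
    out.y = static_cast<int>(std::lround((b.y - padY) / scale));
    out.width = static_cast<int>(std::lround(b.width / scale));
    out.height = static_cast<int>(std::lround(b.height / scale));
    return out;
}
```

For a 1280x720 image fed to a 640x640 network, the scale is 0.5 and the vertical padding is 140 pixels, so a network-space box at (100, 200) of size 50x60 maps to (200, 120) of size 100x120 in the image.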

opencv-alalek · 2023-12-18T21:07:22Z

modules/dnn/include/opencv2/dnn/dnn.hpp

+         CV_DEPRECATED_EXTERNAL  // avoid using in C++ code (need to fix bindings first)
+         YoloObjectDetector();
+
+         CV_WRAP void detect(InputArray frame, CV_OUT std::vector<int>& classIds,


Why do we still have this here?

User uses DetectionModel::detect() from the base class instead.

opencv-alalek · 2023-12-18T21:08:27Z

modules/dnn/src/model.cpp

+    ImagePaddingMode paddingMode;
+    YoloVersion yoloVersion;
+    float padValue;
+    float Height, Width;


opencv-alalek · 2023-12-18T21:09:17Z

modules/dnn/src/model.cpp

+    std::vector<Mat> detections;
+    impl.dynamicCast<YOLODetectionModel_Impl>()->processFrame(frame, detections);
+
+    postProccess(


No business logic is allowed in PImpl methods.

Where should this logic exist if not in overridden method of child class?

Obviously in implementation class.

Then why does DetectionModel have the same "business logic" implemented in the main class? Take a look here

It should be refactored too (moved to Impl part).

Check TextDetectionModel => TextDetectionModel_EAST / TextDetectionModel_DB
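The layering requested in this thread can be sketched as follows: the public class only forwards to an implementation object, and a model variant overrides the implementation rather than the public class. All names here (`Detector`, `Detector::Impl`, `TopOneImpl`) are hypothetical, and the "detection" logic is a trivial threshold on scores chosen only to keep the example self-contained; it is not the structure of the OpenCV classes.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

class Detector
{
public:
    struct Impl
    {
        virtual ~Impl() = default;
        // All model-specific logic lives in the implementation class.
        virtual std::vector<int> detect(const std::vector<float>& scores,
                                        float threshold) const
        {
            std::vector<int> ids;
            for (std::size_t i = 0; i < scores.size(); ++i)
                if (scores[i] >= threshold)
                    ids.push_back(static_cast<int>(i));
            return ids;
        }
    };

    Detector() : impl_(std::make_shared<Impl>()) {}
    explicit Detector(std::shared_ptr<Impl> impl) : impl_(std::move(impl)) {}

    // The public method only forwards; it holds no logic of its own.
    std::vector<int> detect(const std::vector<float>& scores, float threshold) const
    {
        return impl_->detect(scores, threshold);
    }

private:
    std::shared_ptr<Impl> impl_;
};

// A variant changes behaviour by overriding the Impl, not the public class.
struct TopOneImpl : Detector::Impl
{
    std::vector<int> detect(const std::vector<float>& scores,
                            float threshold) const override
    {
        int best = -1;
        for (std::size_t i = 0; i < scores.size(); ++i)
            if (scores[i] >= threshold && (best < 0 || scores[i] > scores[best]))
                best = static_cast<int>(i);
        return best < 0 ? std::vector<int>{} : std::vector<int>{best};
    }
};
```

With this split, the public header never needs a second detect() overload: the signature stays in one place and only the implementation differs per model.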

dkurt · 2023-12-22T06:21:12Z

modules/dnn/src/model.cpp

+        for(auto & detection : detections){
+            cv::transposeND(detection, {0, 2, 1}, detection);
+        }
+    }


Move to the loop below

dkurt · 2023-12-22T06:21:38Z

modules/dnn/src/model.cpp

+}
+
+void YOLODetectionModel::postProccess(
+    std::vector<Mat>& detections,


Suggested change

std::vector<Mat>& detections,

const std::vector<Mat>& detections,

dkurt · 2023-12-22T06:22:35Z

modules/dnn/src/model.cpp

+            } else {
+                CV_Error(Error::StsNotImplemented, "Unsupported padding mode");
+            }
+    }


Code style issue

dkurt · 2023-12-22T06:24:18Z

modules/dnn/src/model.cpp

+        net.forward(outs, outNames);
+    }
+
+    void setPaddingValue(const float padValue_){


const can be avoided

dkurt · 2023-12-22T06:25:46Z

modules/dnn/src/model.cpp

+YOLODetectionModel::YOLODetectionModel(const String& model, const String& config)
+{
+    impl = makePtr<YOLODetectionModel_Impl>();
+    impl->initNet(readNet(model, config));


So let's use readNetFromDarknet

dkurt · 2023-12-22T06:26:58Z

modules/dnn/src/model.cpp

+    for (auto preds : detections)
+    {
+        if (!darknet)
+            preds = preds.reshape(1, preds.size[1]);


This step is valid for every model. So can we remove the if condition?

It is not valid for every model: for yolov5 and larger models it does not hold.

Can you please share the shapes for them?

dkurt · 2023-12-22T06:29:59Z

modules/dnn/src/model.cpp

+
+    if (!darknet && detections[0].size[1] < detections[0].size[2]) {
+        yolov8 = true;  // Set the correct flag based on tensor shape
+    }


The darknet flag can be removed completely considering https://github.com/opencv/opencv/pull/24691/files#r1434763871 and the following change:

    if (detections[0].dims == 3 && detections[0].size[1] < detections[0].size[2]) {
        yolov8 = true;  // Set the correct flag based on tensor shape
    }
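The shape heuristic above, combined with the {0, 2, 1} transpose that the PR applies with cv::transposeND, can be sketched without OpenCV as follows. `Tensor3`, `isAttributesFirst` and `swapLastTwoAxes` are hypothetical names, and the sketch assumes a 3-D row-major tensor. The heuristic only works when there are fewer per-box attributes than candidate boxes, which is the review's own condition.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3-D cv::Mat.
struct Tensor3
{
    std::array<int, 3> size;   // {batch, rows, cols}
    std::vector<float> data;   // row-major storage
};

// Heuristic from the review: the output is laid out attributes-first
// when size[1] < size[2].
inline bool isAttributesFirst(const Tensor3& t)
{
    return t.size[1] < t.size[2];
}

// Swap the last two axes, i.e. the same effect as transposing with
// the axis order {0, 2, 1}.
inline Tensor3 swapLastTwoAxes(const Tensor3& t)
{
    const int n = t.size[0], r = t.size[1], c = t.size[2];
    Tensor3 out{{n, c, r}, std::vector<float>(t.data.size())};
    for (int b = 0; b < n; ++b)
        for (int i = 0; i < r; ++i)
            for (int j = 0; j < c; ++j)
                out.data[(static_cast<std::size_t>(b) * c + j) * r + i] =
                    t.data[(static_cast<std::size_t>(b) * r + i) * c + j];
    return out;
}
```

After the swap, each row holds one candidate box with all of its attributes, so the same post-processing loop can handle both layouts without a separate model-version flag.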

Documentation for Yolo usage in Opencv #24898: This PR introduces documentation for the usage of the YOLO detection model family in OpenCV. It is not to be merged before #24691, as the sample will need to be changed.

Abdurrahheem added the category: dnn label Dec 12, 2023

Abdurrahheem added this to the 4.9.0 milestone Dec 12, 2023

Abdurrahheem requested review from asmorkalov, dkurt and fengyuentau December 12, 2023 18:35

Abdurrahheem self-assigned this Dec 12, 2023

opencv-alalek added the RFC label Dec 12, 2023

opencv-alalek reviewed Dec 13, 2023

dkurt reviewed Dec 14, 2023

modules/dnn/include/opencv2/dnn/dnn.hpp (outdated)

dkurt reviewed Dec 14, 2023

modules/dnn/include/opencv2/dnn/dnn.hpp (outdated)

dkurt reviewed Dec 14, 2023

modules/dnn/src/model.cpp (outdated)

fengyuentau reviewed Dec 14, 2023

modules/dnn/src/model.cpp

dkurt reviewed Dec 15, 2023

modules/dnn/src/model.cpp

dkurt reviewed Dec 15, 2023

modules/dnn/include/opencv2/dnn/dnn.hpp

dkurt reviewed Dec 15, 2023

modules/dnn/include/opencv2/dnn/dnn.hpp (outdated)

dkurt reviewed Dec 15, 2023

asmorkalov marked this pull request as ready for review December 17, 2023 10:13

asmorkalov reviewed Dec 18, 2023

modules/dnn/test/test_model.cpp

modules/dnn/test/test_model.cpp

dkurt reviewed Dec 19, 2023

modules/dnn/include/opencv2/dnn/dnn.hpp

dkurt reviewed Dec 19, 2023

modules/dnn/src/model.cpp (outdated)

dkurt reviewed Dec 19, 2023

modules/dnn/src/model.cpp (outdated)

opencv-alalek reviewed Dec 19, 2023

Abdurrahheem added 9 commits December 22, 2023 08:52

renamed yolo model. fixed enum values

63c735b

added support for rescaling boxes back to original image scale.

4faa996

Other minor fixes from PR

fix rescaling

0367e33

rect instead of rect2d

341e314

working rescaling with all types of preprocessing

962d3f8

added tests

914f1c6

get rid of darknet and yolov8

18a1524

fixes to PR comments. Removed dependcy yolo version complitely

ff97989

change frame size and yolov8 identification strategy

21c3d60

dkurt reviewed Dec 22, 2023

modules/dnn/src/model.cpp

dkurt reviewed Dec 22, 2023

modules/dnn/src/model.cpp

dkurt reviewed Dec 22, 2023

asmorkalov modified the milestones: 4.9.0, 4.10.0 Dec 22, 2023

remove costum rescale

2899387

Abdurrahheem force-pushed the ash/yolo-objdec-api branch from 150b264 to 2899387 Compare December 22, 2023 12:54

This was referenced Dec 27, 2023

Changed onnx weights of yolov7 #24783

Merged

Changed onnx weights of yolov7 opencv/opencv_extra#1134

Merged

Abdurrahheem mentioned this pull request Jan 21, 2024

Documentation for Yolo usage in Opencv #24898

Merged

6 tasks

asmorkalov modified the milestones: 4.10.0, 4.11.0 Apr 27, 2024

asmorkalov marked this pull request as draft September 13, 2024 11:03

asmorkalov modified the milestones: 4.11.0, 5.0-release Dec 17, 2024

asmorkalov closed this Oct 17, 2025
