
Merged
7 changes: 0 additions & 7 deletions modules/dnn/src/layers/fully_connected_layer.cpp
@@ -455,13 +455,6 @@ class FullyConnectedLayerImpl CV_FINAL : public InnerProductLayer
             ret = false;
             break;
         }
-
-        if (!use_half && bias && (outerSize > 1))
-        {
-            UMat biasOnesMat = UMat::ones(outerSize, 1, umat_blobs[0].type());
-            UMat& biases = umat_blobs[1];
-            cv::gemm(biasOnesMat, biases, 1, dstMat, 1, dstMat, 0);
-        }
     }

     if (ret) return true;
20 changes: 12 additions & 8 deletions modules/dnn/src/ocl4dnn/src/ocl4dnn_inner_product.cpp
@@ -97,15 +97,19 @@ bool OCL4DNNInnerProduct<Dtype>::Forward(const UMat& bottom,
max_image_size);
}

-    if (use_half_ && bias_term_)
-    {
-        UMat biasOneMat = UMat::ones(M_, 1, CV_32F);
-        UMat newbias, tmpTop;
-        convertFp16(bias, newbias);
-        convertFp16(top, tmpTop);
-        cv::gemm(biasOneMat, newbias, 1, tmpTop, 1, tmpTop, 0);
-        convertFp16(tmpTop, top);
+    if (bias_term_) {
+        if (use_half_) {
+            UMat biasOneMat = UMat::ones(M_, 1, CV_32F);
+            UMat newbias, tmpTop;
+            convertFp16(bias, newbias);
+            convertFp16(top, tmpTop);
+            cv::gemm(biasOneMat, newbias, 1, tmpTop, 1, tmpTop, 0);
+            convertFp16(tmpTop, top);
+        } else {
+            UMat biasOnesMat = UMat::ones(M_, 1, CV_32F);
Member commented:

Is that correct to use ones for FP32 too? BTW, can you remind why ones were used for FP16?

Member Author replied:

> Is that correct to use ones for FP32 too?

If I am not mistaken, the FP16 data is cast back to FP32 before calling cv::gemm(). This is done for FP16, and it should also work for FP32.

> can you remind why ones were used for FP16?

To use cv::gemm() for the bias addition, under the assumption that the bias has shape [N] or [1, N]:

Y = alpha * A * B + beta * C
=> alpha = beta = 1:  Y = A * B + C
=> A = ones<M, 1>, B = bias<1, N>:  Y = bias<M, N> + C<M, N>

It does not work if the bias has shape [M, 1] or [M, N], but for now OCL4DNNInnerProduct is only used by the InnerProduct layer in fully_connected_layer.cpp.


I will open another pull request adding an OpenCL backend implementation for the Gemm layer in gemm_layer.cpp.
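The ones-vector trick described above can be sketched independently of OpenCV. This is a minimal illustration in plain Python, with a naive matmul standing in for cv::gemm; the function and variable names here are illustrative, not from OpenCV:

```python
def gemm(A, B, alpha, C, beta):
    """Naive Y = alpha * (A @ B) + beta * C, mirroring cv::gemm's argument order."""
    M, K = len(A), len(A[0])
    N = len(B[0])
    return [[alpha * sum(A[i][k] * B[k][j] for k in range(K)) + beta * C[i][j]
             for j in range(N)] for i in range(M)]

M, N = 3, 2
bias = [[10.0, 20.0]]                                            # B = bias<1, N>
top = [[float(i * N + j) for j in range(N)] for i in range(M)]   # C<M, N>
ones = [[1.0] for _ in range(M)]                                 # A = ones<M, 1>

# ones<M, 1> @ bias<1, N> tiles the bias row M times, so with
# alpha = beta = 1 the gemm adds the bias to every row of top.
out = gemm(ones, bias, 1.0, top, 1.0)

# Same result as explicit row-wise broadcasting of the bias.
expected = [[top[i][j] + bias[0][j] for j in range(N)] for i in range(M)]
assert out == expected
```

This only works because the bias is a single row; a bias of shape [M, 1] or [M, N] would need a different formulation, which matches the limitation noted in the comment above.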

+            cv::gemm(biasOnesMat, bias, 1, top, 1, top, 0);
+        }
     }

return ret;