@mshabunin
Contributor

Fixes the performance of convertTo with multiplication on RISC-V RVV 0.7.1 (tested on a LicheePi 4A with the Xuantie 2.8.0 toolchain):

  • use madd/fmadd (a = a * b + c) instead of macc/fmacc (c = a * b + c) - this helps to use the v-registers efficiently
  • modified v_load/v_store - for some reason 8-bit load/store instructions were being used; the change avoids type conversions and the extra instructions they add

Some tests slowed down (by up to 5 times), but about a quarter of the core tests became up to 10 times faster. The picture is similar for imgproc, where about half of the tests became up to 6 times faster. I'm going to improve the other v_load intrinsics and investigate ways to fix the cases that slowed down.
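
For readers unfamiliar with the two multiply-add forms mentioned in the first bullet, here is a minimal scalar sketch of the difference. It is not the code from this patch and uses no RVV intrinsics: `macc_model`, `madd_model`, `scale_with_macc` and `scale_with_madd` are hypothetical names made up for illustration. The register-pressure explanation in the comments is an assumption about why the madd form helps; the description above only says that it uses v-registers more efficiently.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of the macc/fmacc form: the accumulator c is overwritten
// (c = a * b + c).
inline void macc_model(double& c, double a, double b) { c = a * b + c; }

// Scalar model of the madd/fmadd form: the multiplicand a is overwritten
// (a = a * b + c).
inline void madd_model(double& a, double b, double c) { a = a * b + c; }

// dst[i] = src[i] * alpha + beta, written with the macc form.
// Assumption: because the accumulator is destroyed, beta has to be copied
// into a fresh value on every iteration.
std::vector<double> scale_with_macc(const std::vector<double>& src,
                                    double alpha, double beta)
{
    std::vector<double> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        double acc = beta;              // extra copy of the constant
        macc_model(acc, src[i], alpha); // acc = src[i] * alpha + acc
        dst[i] = acc;
    }
    return dst;
}

// The same computation with the madd form.
// The loaded value is overwritten in place, so alpha and beta stay intact
// and no per-iteration copy is needed.
std::vector<double> scale_with_madd(const std::vector<double>& src,
                                    double alpha, double beta)
{
    std::vector<double> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        double v = src[i];              // loaded data, reused as the result
        madd_model(v, alpha, beta);     // v = v * alpha + beta
        dst[i] = v;
    }
    return dst;
}
```

Both helpers return the same values for the same input; only the operand that gets clobbered differs, which is the property the first bullet relies on.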

@asmorkalov asmorkalov self-requested a review January 29, 2024 14:32
@asmorkalov asmorkalov self-assigned this Jan 29, 2024
@asmorkalov asmorkalov added this to the 4.10.0 milestone Jan 29, 2024
@asmorkalov asmorkalov merged commit 54b7caf into opencv:4.x Jan 29, 2024
Contributor

@asmorkalov asmorkalov left a comment

👍

@mshabunin mshabunin deleted the fix-rvv07-scale64f branch January 29, 2024 16:19
This was referenced Feb 3, 2024