Vectorize replace_copy for x64 / x86 #5908

@AlexGuteniev

Yes, we have some low-hanging fruit in vectorization 🍎

Some time ago I discovered that the compiler needs only a little help to auto-vectorize replace_copy and replace_copy_if. The help was very minor compared to the couple of other places where we also help the compiler auto-vectorize.

So PR #4431 was opened, and it was merged with only mild reluctance.

Shortly thereafter, the optimization ceased to work. I filed DevCom-10895463, but unfortunately it hasn't been fixed so far. The final VS 2022 release ended up with replace_copy and replace_copy_if non-vectorized.

We cannot manually vectorize replace_copy_if, as it takes a user-supplied predicate, and there isn't a reasonable standard predicate to query against. (Well, actually there might be a way, but even if what I'm thinking of would work, you would not like it.) Anyway, we can still vectorize replace_copy, and it is a pretty easy thing to do.

Note that, unlike assisted auto-vectorization, the manual vectorization condition has to be strict. Only contiguous iterators! And be sure to get the pointers properly; see #5683 and the linked issue and paper. Also, the new approach to the control macro should be used, but this part is hard to miss.

Unlike replace, it doesn't need to rely on AVX2 masked stores. It is classic byte blending, so it can work on any element size.
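To illustrate the blending idea, here is a minimal sketch for 4-byte elements, not the STL's actual implementation. The function name `replace_copy_u32_sketch` and the `SKETCH_HAS_SSE2` macro are hypothetical. It takes raw pointers, standing in for the already-unwrapped contiguous iterators. It assumes SSE2 is available and otherwise falls back to the scalar loop. The blend is written as and/andnot/or so that SSE2 alone is enough; a real implementation would likely use wider vectors and a dedicated blend instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#else
#define SKETCH_HAS_SSE2 0
#endif

// Hypothetical helper, for illustration only.
// Copies [first, first + count) to dest, writing new_val wherever the source equals old_val.
// The source and destination ranges are assumed not to overlap.
inline void replace_copy_u32_sketch(const std::uint32_t* first, std::size_t count,
                                    std::uint32_t* dest,
                                    std::uint32_t old_val, std::uint32_t new_val) {
    assert(count == 0 || (first != nullptr && dest != nullptr));
    std::size_t i = 0;
#if SKETCH_HAS_SSE2
    const __m128i old_vec = _mm_set1_epi32(static_cast<int>(old_val));
    const __m128i new_vec = _mm_set1_epi32(static_cast<int>(new_val));
    for (; i + 4 <= count; i += 4) {
        const __m128i src = _mm_loadu_si128(reinterpret_cast<const __m128i*>(first + i));
        // Lanes equal to old_val become all-ones, the rest all-zeros.
        const __m128i mask = _mm_cmpeq_epi32(src, old_vec);
        // Blend: new_vec where the mask is set, the original source elsewhere.
        const __m128i out =
            _mm_or_si128(_mm_and_si128(mask, new_vec), _mm_andnot_si128(mask, src));
        // Plain full-width store: every output element is written anyway,
        // so no masked store is needed.
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dest + i), out);
    }
#endif
    // Scalar tail (and the whole range when SSE2 is not available).
    for (; i < count; ++i) {
        dest[i] = first[i] == old_val ? new_val : first[i];
    }
}
```

Because replace_copy writes every destination element regardless of whether it matched, the store is unconditional. For other element sizes only the comparison width would change in this sketch (for example `_mm_cmpeq_epi8` or `_mm_cmpeq_epi16`), while the blend itself stays the same.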
