Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

tannergooding
Copy link
Member

This is a small follow up to #68874 but provides decent improvements, particularly where the same constant (Zero and AllBitsSet in particular) is repeatedly used.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 2, 2022
@ghost ghost assigned tannergooding Jun 2, 2022
@ghost
Copy link

ghost commented Jun 2, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

This is a small follow up to #68874 but provides decent improvements, particularly where the same constant (Zero and AllBitsSet in particular) is repeatedly used.

Author: tannergooding
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

Copy link
Contributor

@TIHan TIHan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and easy change.

@tannergooding
Copy link
Member Author

Example diffs

Note that ymm7 is set to "all bits set" first, and then reused on the last line rather than also using ymm6

        vpcmpeqd ymm7, ymm7, ymm7
        vxorps   ymm6, ymm7, ymm6
        vcmpps   ymm4, ymm4, ymm0, 0
        vorps    ymm4, ymm4, ymm6
-       vpcmpeqd ymm6, ymm6, ymm6
-       vpcmpeqd ymm4, ymm4, ymm6
+       vpcmpeqd ymm4, ymm4, ymm7

Note how ymm2 isn't reloaded with reloc @RDW00 because it was already in ymm2:

        vmovupd  ymm2, ymmword ptr[reloc @RWD00]
        vpshufb  ymm0, ymm0, ymm2
        vperm2i128 ymm0, ymm0, ymm0, 1
-       vmovupd  ymm2, ymmword ptr[reloc @RWD00]
        vpshufb  ymm1, ymm1, ymm2

case GT_CNS_INT:
case GT_CNS_LNG:
case GT_CNS_DBL:
case GT_CNS_VEC:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting ARM64 already had a path here since it needed special handling for Zero and AllBitsSet, it just wasn't getting used in LSRA

@tannergooding
Copy link
Member Author

@kunalspathak could you take a look? Its super small (8 lines) but touches LSRA so I'd appreciate your sign-off as well.

Copy link
Contributor

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants