jakobbotsch
Member

Allow forward sub to move a tree past another tree that throws an exception provided that they both throw the same exception.

I hit a regression in #85569 where this would have helped.
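
The rule described above can be sketched as a small predicate. This is only an illustration: `ExSet`, `isSingleException`, and `canReorder` are hypothetical names, the bit values are invented stand-ins rather than the JIT's real `ExceptionSetFlags` values, and the sketch ignores every other interference check forward sub performs (stores, calls, ordering side effects).

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the JIT's ExceptionSetFlags.
enum class ExSet : std::uint32_t
{
    None            = 0x0,
    Overflow        = 0x1,
    DivideByZero    = 0x2,
    NullReference   = 0x4,
    IndexOutOfRange = 0x8,
    All             = 0xF,
};

constexpr std::uint32_t bits(ExSet s)
{
    return static_cast<std::uint32_t>(s);
}

constexpr ExSet operator|(ExSet a, ExSet b)
{
    return static_cast<ExSet>(bits(a) | bits(b));
}

// True when the set names exactly one exception kind (exactly one bit set).
constexpr bool isSingleException(ExSet s)
{
    return (bits(s) != 0) && ((bits(s) & (bits(s) - 1)) == 0);
}

// As far as exceptions are concerned, moving one tree past another is
// unobservable when either tree cannot throw, or when both can throw only
// the same single exception kind.
constexpr bool canReorder(ExSet moved, ExSet passed)
{
    return (moved == ExSet::None) || (passed == ExSet::None) ||
           (isSingleException(moved) && (moved == passed));
}
```

Under these assumptions, two trees that can each only throw a null reference exception may be swapped, while a tree that can throw either of two exceptions may not be moved past an identical one, since the order decides which exception surfaces.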

@ghost ghost assigned jakobbotsch May 2, 2023
@ghost ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) May 2, 2023
@jakobbotsch jakobbotsch changed the title from "JIT: Allow forward sub to reorder trees throwing same exceptions" to "JIT: Allow forward sub to reorder trees throwing the same single exception" May 2, 2023
@jakobbotsch jakobbotsch closed this May 2, 2023
@jakobbotsch jakobbotsch reopened this May 2, 2023
@jakobbotsch jakobbotsch marked this pull request as ready for review May 3, 2023 10:48
@jakobbotsch
Member Author

cc @dotnet/jit-contrib PTAL @markples @AndyAyersMS

Diffs.

Regressions because:

  1. The usual: fewer locals means less natural interval splitting for LSRA. When write barriers are involved this can sometimes lead to unfortunate spills if there is a local in one of the args that the write barrier is going to kill.
  2. Some reorderings cause us to no longer be able to contain a source operand, e.g. before:
N039 (  3,  2) [000010] -----------                   t10 =    LCL_VAR   long   V01 loc0         u:1 rsi (last use) REG rsi <l:$340, c:$380>
N041 (  1,  1) [000011] ----------z                   t11 =    LCL_VAR   ref    V00 this         u:1 rdi REG rdi $80
                                                            ┌──▌  t11    ref    
N043 (  2,  2) [000050] -c---------                   t50 =   LEA(b+16) byref  REG NA
                                                            ┌──▌  t50    byref  
N045 (  4,  4) [000012] nc--GO-----                   t12 =   IND       long   REG NA <l:$3c1, c:$3c2>
                                                            ┌──▌  t10    long   
                                                            ├──▌  t12    long   
N047 (  8,  7) [000013] ----GO-----                   t13 =   SUB       long   REG rsi <l:$3c6, c:$3c5>
                                                            ┌──▌  t13    long   
N049 (  3,  2) [000014] DA--GO-N---                           STORE_LCL_VAR long   V02 loc1         d:1 rsi REG rsi $VN.Void
N051 (???,???) [000078] -----------                            IL_OFFSET void   INLRT @ 0x017[E-] REG NA
N053 (  1,  1) [000017] -----------                   t17 =    LCL_VAR   ref    V00 this         u:1 rdi REG rdi $80
                                                            ┌──▌  t17    ref    
N055 (  2,  2) [000054] -c---------                   t54 =   LEA(b+8)  byref  REG NA
                                                            ┌──▌  t54    byref  
N057 (  4,  4) [000018] nc--GO-----                   t18 =   IND       long   REG NA <l:$3c8, c:$3c9>
N059 (  3,  2) [000019] -----------                   t19 =    LCL_VAR   long   V02 loc1         u:1 rsi (last use) REG rsi <l:$3c3, c:$3c4>
                                                            ┌──▌  t18    long   
                                                            ├──▌  t19    long   
N061 (  8,  7) [000020] ----GO-----                   t20 =   ADD       long   REG rsi <l:$3cd, c:$3cc>

after (notice no containment flag on [000018])

N039 (  1,  1) [000017] ----------z                   t17 =    LCL_VAR   ref    V00 this         u:1 rdi REG rdi $80
                                                            ┌──▌  t17    ref    
N041 (  2,  2) [000052] -c---------                   t52 =   LEA(b+8)  byref  REG NA
                                                            ┌──▌  t52    byref  
N043 (  4,  4) [000018] n---GO-----                   t18 =   IND       long   REG rbx <l:$3c1, c:$3c2>
N045 (  3,  2) [000010] -----------                   t10 =    LCL_VAR   long   V01 loc0         u:1 rsi (last use) REG rsi <l:$340, c:$380>
N047 (  1,  1) [000011] -----------                   t11 =    LCL_VAR   ref    V00 this         u:1 rdi REG rdi $80
                                                            ┌──▌  t11    ref    
N049 (  2,  2) [000054] -c---------                   t54 =   LEA(b+16) byref  REG NA
                                                            ┌──▌  t54    byref  
N051 (  4,  4) [000012] nc--GO-----                   t12 =   IND       long   REG NA <l:$3c4, c:$3c5>
                                                            ┌──▌  t10    long   
                                                            ├──▌  t12    long   
N053 (  8,  7) [000013] ----GO-----                   t13 =   SUB       long   REG rsi <l:$3c9, c:$3c8>
                                                            ┌──▌  t18    long   
                                                            ├──▌  t13    long   
N055 ( 13, 12) [000020] ----GO-----                   t20 =   ADD       long   REG rsi <l:$3cd, c:$3cc>

This seems fixable by giving the backend interference checks the same treatment as in this PR, but I will look into that separately. (Unfortunately these have GTF_ORDER_SIDEEFF, so we won't be able to handle this case.)

@jakobbotsch
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

}

m_accumulatedFlags |= (node->gtFlags & GTF_GLOB_EFFECT);
if ((node->gtFlags & GTF_CALL) != 0)
Member

Seems odd that OperExceptions doesn't really handle GT_CALL. I wonder if there's any gain to be had by allowing some helper calls to forward sub... probably not much.

Member Author

This was intentional when I first added this API. It's really because of what @markples mentions below: OperExceptions does not make sense for user calls. The return value of ExceptionSetFlags::All indicates that something can throw all the built-in exceptions, but user calls can throw many more exception types. So to model that we would probably add some ExceptionSetFlags::UnknownException, but then the callers would need to either handle it specially or assert that they didn't see it.
The handling here by setting All is really just relying on forward-sub specific knowledge that the caller is going to handle that in the correct way; it's not actually correct to say that GTF_CALL nodes can throw those exceptions only.

> I wonder if there's any gain to be had by allowing some helper calls to forward sub... probably not much.

Definitely something we could try. We would need to generalize things a bit here to track more precise effects, since at that point we are throwing away the "GTF_CALL means user call" assumption that the current interference checks in the caller are relying on. (But that might also make sense if we are going to end up with a GT_BOX that has GTF_CALL set on it in the future.)
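
A rough sketch of the accumulation being discussed, under loudly stated assumptions: `Node`, `accumulate`, and the `Ex*` constants are hypothetical stand-ins, not the JIT's actual types or values. It only illustrates how recording a call as "all exceptions" stays safe as long as callers trust the set solely when it names exactly one exception.

```cpp
#include <cstdint>

// Hypothetical bit values; the real ExceptionSetFlags enum differs.
constexpr std::uint32_t ExNone     = 0x0;
constexpr std::uint32_t ExOverflow = 0x1;
constexpr std::uint32_t ExNullRef  = 0x2;
constexpr std::uint32_t ExBounds   = 0x4;
constexpr std::uint32_t ExAll      = ExOverflow | ExNullRef | ExBounds;

// Hypothetical stand-in for a JIT tree node.
struct Node
{
    bool          isCall;         // models GTF_CALL being set on the node
    std::uint32_t operExceptions; // models what OperExceptions reports for a non-call
};

// Fold one node into the running set. A call is recorded as ExAll, which in
// this sketch means "imprecise", not "throws exactly these built-in exceptions".
constexpr std::uint32_t accumulate(std::uint32_t acc, Node node)
{
    return acc | (node.isCall ? ExAll : node.operExceptions);
}

// Callers should rely on the set only when it names exactly one exception.
constexpr bool isSingleException(std::uint32_t set)
{
    return (set != 0) && ((set & (set - 1)) == 0);
}
```

With this shape, any tree containing a call yields a multi-bit set, so a "same single exception" check rejects it without the set having to describe user exceptions precisely. A dedicated "unknown exception" bit would be the more explicit alternative mentioned above, at the cost of every caller having to handle it.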

return m_useFlags;
}

ExceptionSetFlags GetExceptions() const
Contributor

It's probably worth additional comments here and/or on m_useExceptions/m_accumulatedExceptions that a value with >1 exceptions isn't trustworthy to avoid some accidental misuse.

Member Author

Will add that.

@markples
Contributor

markples commented May 3, 2023

A thing that confused me for a bit in here is that the set of exceptions is just a small set of possible exceptions (ExceptionSetFlags). This becomes a bit obvious when seeing that CALL isn't handled (presumably throw is a call by this point), so this is really just about certain built-in exceptions for known operations.

... which after thinking more makes more sense because two user exceptions of the same type might be distinguishable by values that they contain (throw MyException(1) vs throw MyException(2))

... and perhaps raises questions about whether it's ok for the debugger to point to the "wrong" throw point for one of these. I'm guessing that is leeway that we allow ourselves but could be confusing ("how did I get to this point if the previous one didn't throw?")
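
The point about user exceptions carrying distinguishing values can be made concrete with a toy model. Everything here is an assumption made for illustration: `Fault`, `observed`, and `swapIsInvisible` are hypothetical, and the model covers only what a handler can inspect, not what a debugger reports as the throw point.

```cpp
// Hypothetical model: an operation either succeeds or faults with a kind
// and, for user exceptions, a payload that a handler can inspect.
struct Fault
{
    int kind;    // 0 = no fault; other values name an exception type
    int payload; // data carried by the exception object; 0 for built-in kinds
};

constexpr bool sameFault(Fault a, Fault b)
{
    return (a.kind == b.kind) && (a.payload == b.payload);
}

// Evaluating `first` and then `second`: the handler sees whichever faults first.
constexpr Fault observed(Fault first, Fault second)
{
    return (first.kind != 0) ? first : second;
}

// Swapping the two operations is invisible to a handler only if both
// evaluation orders surface an identical fault.
constexpr bool swapIsInvisible(Fault a, Fault b)
{
    return sameFault(observed(a, b), observed(b, a));
}
```

In this model two null reference faults (same kind, no payload) can be swapped freely, whereas `MyException(1)` and `MyException(2)` share a kind but differ in payload, so the swap changes what the handler sees.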

@jakobbotsch
Member Author

> A thing that confused me for a bit in here is that the set of exceptions is just a small set of possible exceptions (ExceptionSetFlags). This becomes a bit obvious when seeing that CALL isn't handled (presumably throw is a call by this point), so this is really just about certain built-in exceptions for known operations.

Indeed, the importer transforms throw into helper calls to CORINFO_HELP_THROW (and marks the blocks as BBJ_THROW blocks).

> ... which after thinking more makes more sense because two user exceptions of the same type might be distinguishable by values that they contain (throw MyException(1) vs throw MyException(2))
>
> ... and perhaps raises questions about whether it's ok for the debugger to point to the "wrong" throw point for one of these. I'm guessing that is leeway that we allow ourselves but could be confusing ("how did I get to this point if the previous one didn't throw?")

Right, it does make the debugger experience worse -- but that's a general problem for forward sub. Another relevant note here is that forward sub is not the only place we do this, call args morphing will reorder arguments as well and uses the same trick to allow some profitable reorderings. In fact call args morphing used to reorder different exceptions as well, and the "same-exception-reordering" was a mitigation of the regressions as part of the fix.

@markples
Contributor

markples commented May 3, 2023

> ... but that's a general problem for forward sub ...

or I suppose any code motion, tail merging, etc. It seems to be an accepted norm that "step next" can jump around in the source code, and indeed if one is doing that then the out-of-order behavior will hopefully be obvious. Break-on-throw and crash dumps give a more point-in-time view of things, so the program state and the execution order are less obviously tied together.

Other examples would be

if (x == 1)
{
    x = 2;
    y = a + b; // why is the debugger showing x==1 here?
}

or

if (b1) throw expr;
if (b2) throw expr;
blah;
return;

==> transformed to ==>

if (b1) goto e;
if (b2) goto e;
blah;
return;

e:
  throw expr // which conditional failed?

where expr being a bounds check exception is the common case. I can't remember if we ended up giving up on merging those in the past (even in optimized builds, almost certain that we kept them separate in debug builds).

Anyways, I'm not opposed to the change at this point, just continuing the discussion. Thanks for the other information about it.

@jakobbotsch
Member Author

> or
>
> if (b1) throw expr;
> if (b2) throw expr;
> blah;
> return;
>
> ==> transformed to ==>
>
> if (b1) goto e;
> if (b2) goto e;
> blah;
> return;
>
> e:
>   throw expr // which conditional failed?
>
> where expr being a bounds check exception is the common case. I can't remember if we ended up giving up on merging those in the past (even in optimized builds, almost certain that we kept them separate in debug builds).

We do merge throw helpers when optimizing, e.g.

using System.Runtime.CompilerServices;

public class Program
{
    public static void Main()
    {
        Foo(int.MaxValue - 15, int.MaxValue - 5);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Foo(int a, int b)
    {
        if (checked(a + 10) < 11)
            return 0;

        if (checked(b + 10) < 11)
            return 1;

        return 2;
    }
}

results in

; Assembly listing for method Program:Foo(int,int):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,  3   )     int  ->  rcx         single-def
;  V01 arg1         [V01,T01] (  3,  2.50)     int  ->  rdx         single-def
;  V02 OutArgs      [V02    ] (  1,  1   )  struct (32) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;
; Lcl frame size = 40

G_M63574_IG01:  ;; offset=0000H
       sub      rsp, 40
                                                ;; size=4 bbWeight=1 PerfScore 0.25
G_M63574_IG02:  ;; offset=0004H
       add      ecx, 10
       jo       SHORT G_M63574_IG07
       cmp      ecx, 11
       jge      SHORT G_M63574_IG05
                                                ;; size=10 bbWeight=1 PerfScore 2.50
G_M63574_IG03:  ;; offset=000EH
       xor      eax, eax
                                                ;; size=2 bbWeight=0.50 PerfScore 0.12
G_M63574_IG04:  ;; offset=0010H
       add      rsp, 40
       ret
                                                ;; size=5 bbWeight=0.50 PerfScore 0.62
G_M63574_IG05:  ;; offset=0015H
       add      edx, 10
       jo       SHORT G_M63574_IG07
       mov      eax, 2
       mov      ecx, 1
       cmp      edx, 11
       cmovl    eax, ecx
                                                ;; size=21 bbWeight=0.50 PerfScore 1.12
G_M63574_IG06:  ;; offset=002AH
       add      rsp, 40
       ret
                                                ;; size=5 bbWeight=0.50 PerfScore 0.62
G_M63574_IG07:  ;; offset=002FH
       call     CORINFO_HELP_OVERFLOW
       int3
                                                ;; size=6 bbWeight=0 PerfScore 0.00

; Total bytes of code 53, prolog size 4, PerfScore 10.55, instruction count 18, allocated bytes for code 53 (MethodHash=53e907a9) for method Program:Foo(int,int):int
; ============================================================

where the two cases are not distinguishable. And indeed, the same thing happens with bounds checks.
