Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jan 8, 2024

This PR enables PGO for castclass/isinst by default. Since we now have a dedicated tier for instrumentation it should not be a big deal to enable more probes IMO. Profiled casts seem to help LINQ for small inputs since LINQ is typically full of type casts for fast paths (e.g. this).

Benchmark:

[HideColumns("Error", "Job", "StdDev", "Median", "RatioSD")]
public class LinqBenchmarks
{
    IEnumerable<int> _enum1 = new int[10];
    IEnumerable<int> _enum2 = new int[10];

    [Benchmark]
    public bool SequenceEqual() => _enum1.SequenceEqual(_enum2);

    [Benchmark]
    public int ElementAt() => _enum1.ElementAt(5);

    [Benchmark]
    public int Count() => _enum1.Count();

    [Benchmark]
    public List<int> ToList() => _enum1.ToList();
}
Method Toolchain Mean Ratio
SequenceEqual \Core_Root\corerun.exe 9.372 ns 2.96
SequenceEqual \Core_Root_PR\corerun.exe 3.171 ns 1.00
ElementAt \Core_Root\corerun.exe 7.163 ns 4.69
ElementAt \Core_Root_PR\corerun.exe 1.529 ns 1.00
Count \Core_Root\corerun.exe 3.482 ns 3.14
Count \Core_Root_PR\corerun.exe 1.127 ns 1.00
ToList \Core_Root\corerun.exe 16.334 ns 1.30
ToList \Core_Root_PR\corerun.exe 12.332 ns 1.00

PS: SequenceEqual should benefit from #96571 too

Codegen example

// Promote IsList to Tier1 with PGO
for (int i = 0; i < 200; i++)
{
    IsList(new int[10]); // source is always int[]
    Thread.Sleep(10);
}

[MethodImpl(MethodImplOptions.NoInlining)]
bool IsList<T>(IEnumerable<T> source) => source is IList<T>;

Codegen diff (Main vs PR) for IsList:

; Assembly listing for method IsList
G_M12490_IG01:
       sub      rsp, 40
G_M12490_IG02:
+      mov      rax, rcx
+      test     rax, rax                 ; if (source == null)
+      je       SHORT G_M12490_IG05
+G_M12490_IG03:
+      mov      rdx, 0x7FF941DE9750
+      cmp      qword ptr [rax], rdx
+      je       SHORT G_M12490_IG05      ; if (source is int[])
+G_M12490_IG04:
       mov      rdx, rcx
       mov      rcx, 0x7FF942439A60
       call     CORINFO_HELP_ISINSTANCEOFINTERFACE ; slow fallback
+G_M12490_IG05:
       test     rax, rax
       setne    al
       movzx    rax, al
-G_M12490_IG03:
+G_M12490_IG06:
       add      rsp, 40
       ret      
; Total bytes of code 59

@AndyAyersMS @dotnet/jit-contrib do you think this should go through an internal run of dotnet/performance first?
I tested impact on TE (no impact on startup and rps) and a few desktop app. Also, I did a run of 250 LINQ benchmarks in dotnet/performance (10% of them improved, none regressed).

@ghost ghost assigned EgorBo Jan 8, 2024
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 8, 2024
@ghost
Copy link

ghost commented Jan 8, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR enables PGO for castclass/isinst by default. Since we now have a dedicated tier for instrumentation it's no longer a big deal to enable more probes IMO. Profiled casts seem to help a lot to LINQ for small inputs since LINQ is typically full of type casts for fast paths (e.g. this).

Benchmark:

IEnumerable<int> _enum1 = new int[10];
IEnumerable<int> _enum2 = new int[10];

[Benchmark]
public bool SequenceEqual() => _enum1.SequenceEqual(_enum2);

[Benchmark]
public int ElementAt() => _enum1.ElementAt(5);

[Benchmark]
public int Count() => _enum1.Count();
Method Job Toolchain Mean Ratio
SequenceEqual Job-YKPWSW \Core_Root\corerun.exe 9.372 ns 2.96
SequenceEqual Job-AYWKCJ \Core_Root_PR\corerun.exe 3.171 ns 1.00
ElementAt Job-YKPWSW \Core_Root\corerun.exe 7.163 ns 4.69
ElementAt Job-AYWKCJ \Core_Root_PR\corerun.exe 1.529 ns 1.00
Count Job-YKPWSW \Core_Root\corerun.exe 3.482 ns 3.14
Count Job-AYWKCJ \Core_Root_PR\corerun.exe 1.127 ns 1.00

Codegen example

// Promote IsList to Tier1 with PGO
for (int i = 0; i < 200; i++)
{
    IsList(new int[10]); // source is always int[]
    Thread.Sleep(10);
}

[MethodImpl(MethodImplOptions.NoInlining)]
bool IsList<T>(IEnumerable<T> source) => source is IList<T>;
; Assembly listing for method IsList
G_M12490_IG01:
       sub      rsp, 40
G_M12490_IG02:
+      mov      rax, rcx
+      test     rax, rax                 ; if (source == null)
+      je       SHORT G_M12490_IG05
+G_M12490_IG03:
+      mov      rdx, 0x7FF941DE9750
+      cmp      qword ptr [rax], rdx
+      je       SHORT G_M12490_IG05      ; if (source is int[])
+G_M12490_IG04:
       mov      rdx, rcx
       mov      rcx, 0x7FF942439A60
       call     CORINFO_HELP_ISINSTANCEOFINTERFACE ; slow fallback
+G_M12490_IG05:
       test     rax, rax
       setne    al
       movzx    rax, al
-G_M12490_IG03:
+G_M12490_IG06:
       add      rsp, 40
       ret      
; Total bytes of code 59

@AndyAyersMS do you think this should go through an internal run of dotnet/performance first?
I tested impact on TE (no impact on startup and rps) and a few desktop app. Also, I did a run of 250 LINQ benchmarks in dotnet/performance (10% of them improved, none regressed).

Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo
Copy link
Member Author

EgorBo commented Jan 8, 2024

/azp run runtime-coreclr pgo, runtime-coreclr libraries-pgo, runtime-coreclr pgostress

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@stephentoub
Copy link
Member

Thanks. If this is isinst/castclass only, I assume it doesn't cover a pattern like this?

// Use `GetType() == typeof(...)` rather than `is` to avoid cast helpers. This is measurably cheaper
// but does mean we could end up missing some rare cases where we could get a span but don't (e.g. a uint[]
// masquerading as an int[]). That's an acceptable tradeoff. The Unsafe usage is only after we've
// validated the exact type; this could be changed to a cast in the future if the JIT starts to recognize it.
// We only pay the comparison/branching costs here for super common types we expect to be used frequently
// with LINQ methods.
bool result = true;
if (source.GetType() == typeof(TSource[]))
{
span = Unsafe.As<TSource[]>(source);
}
else if (source.GetType() == typeof(List<TSource>))
{
span = CollectionsMarshal.AsSpan(Unsafe.As<List<TSource>>(source));
}
else
{
span = default;
result = false;
}

(or is that covered separately?)

@EgorBo
Copy link
Member Author

EgorBo commented Jan 8, 2024

Thanks. If this is isinst/castclass only, I assume it doesn't cover a pattern like this?

// Use `GetType() == typeof(...)` rather than `is` to avoid cast helpers. This is measurably cheaper
// but does mean we could end up missing some rare cases where we could get a span but don't (e.g. a uint[]
// masquerading as an int[]). That's an acceptable tradeoff. The Unsafe usage is only after we've
// validated the exact type; this could be changed to a cast in the future if the JIT starts to recognize it.
// We only pay the comparison/branching costs here for super common types we expect to be used frequently
// with LINQ methods.
bool result = true;
if (source.GetType() == typeof(TSource[]))
{
span = Unsafe.As<TSource[]>(source);
}
else if (source.GetType() == typeof(List<TSource>))
{
span = CollectionsMarshal.AsSpan(Unsafe.As<List<TSource>>(source));
}
else
{
span = default;
result = false;
}

(or is that covered separately?)

This code looks like it's already devirtualized by hands so there is nothing JIT can do additionally here. Although, you can convert it back to is TSource[] arr if you want (but it might regress NAOT that typically doesn't have any PGO data and Mono).

One benefit of is TSource[] arr would be that uint[]/int[] case 🙂

@stephentoub
Copy link
Member

you can convert it back to is TSource[] arr if you want
One benefit of is TSource[] arr would be that uint[]/int[] case 🙂

It was written this way to explicitly avoid the variance costs

@EgorBo EgorBo marked this pull request as ready for review January 8, 2024 09:54
@EgorBo EgorBo requested a review from AndyAyersMS January 8, 2024 09:55
@EgorBo
Copy link
Member Author

EgorBo commented Jan 8, 2024

PGO pipelines fail with #96612 (unrelated issue)

@AndyAyersMS
Copy link
Member

As a follow up you might look at whether cloning can profitably key off of one of these casts. Right now type test cloning requires OMF_HAS_GUARDEDDEVIRT.

@EgorBo
Copy link
Member Author

EgorBo commented Jan 8, 2024

As a follow up you might look at whether cloning can profitably key off of one of these casts. Right now type test cloning requires OMF_HAS_GUARDEDDEVIRT.

Ah good point, will file an issue for it!

@github-actions github-actions bot locked and limited conversation to collaborators Mar 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants