Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Share implementation of ComWrappers between CoreCLR and NativeAOT #113907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 34 commits into
base: main
Choose a base branch
from

Conversation

jkoritzinsky
Copy link
Member

@jkoritzinsky jkoritzinsky commented Mar 26, 2025

Share the managed implementation of ComWrappers from NativeAOT with CoreCLR.

Remove all parts of CoreCLR's native implementation that don't need to run during GC and provide a manually-managed implementation of the logic that must run during GC.

Reimplement DAC's ComWrappers support on top of the ConditionalWeakTables instead of using the entries in the syncblock.

Memory improvements for a synthetic benchmark of exposing 1 million objects to COM and consuming 1 million objects from COM:

Main Peak Working Set: 1,027,820 KiB
PR Peak Working Set: 779,712 KiB

Move portions of ComWrappers.NativeAot.cs that aren't run during GC into ComWrappers.Common.cs

Convert CoreCLR to use the managed ComWrappers implementation wherever possible (everywhere except for during GC)

Move wrapper ID to be a CoreCLR-only concept (as it's only needed for diagnostics support)
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 46 out of 48 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • src/coreclr/System.Private.CoreLib/System.Private.CoreLib.csproj: Language not supported
  • src/coreclr/nativeaot/System.Private.CoreLib/src/System.Private.CoreLib.csproj: Language not supported

@jkoritzinsky
Copy link
Member Author

@jkotas any more feedback?

@AaronRobinsonMSFT
Copy link
Member

I don't think micro benchmarks are going to be all that helpful so running against our own tests, that do a lot of validating wrapper clean-up, would help indicate the general impact.

Was the above done or are you looking for more feedback before doing that?

@jkoritzinsky
Copy link
Member Author

I wanted to make sure I had all feedback before I start running benchmarks.

I'll start on the benchmarking.

@jkoritzinsky
Copy link
Member Author

I did some local benchmarking with the ComWrappers/API test suite, as requested. I timed 10 runs of that test suite with the implementation in this PR and the implementation in main, both Release builds (10 runs to warm the processor cache, 10 runs for the measurements below) and got the following results:

  • PR: 2.5039942 seconds
  • Main: 2.5145364 seconds

@AustinWise
Copy link
Contributor

Here is what the references between different managed object wrapper objects now are. I added the new static conditional weak table on ComWrappers:

graph TD
WO[wrapped object];
MOW["ManagedObjectWrapper (unmanaged struct)"];
MOWH[ManagedObjectWrapperHolder];
Releaser[ManagedObjectWrapperReleaser];
WO -->|ComWrappers static conditional weak table| MOWH;
WO -->|ComWrappers instance conditional weak table| MOWH;
MOWH --> WO;
MOWH --> MOW;
MOW -->|ref-counted handle| MOWH;
Releaser --> MOW;
MOWH --> Releaser;
WO -->|SyncBlock| MOW;
Loading

Adding the static ConditionalWeakTable fixes the problem of the pointers in the SyncBlock becoming dangling. But it can cause a memory leak where there was not one before. Before this change, when a ComWrappers instance was collected, all of its associated managed object wrappers became eligible for collection (assuming all refcounts were zero). After this change, the static ConditionalWeakTable keeps every ManagedObjectWrapperHolder ever made for an object alive until the wrapped object is collected.

This program should illustrate the memory leak:

reproduction
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.Marshalling;

unsafe class Program
{
    static Foo sRootedFoo = new Foo();

    static void Main()
    {
        Console.WriteLine($"I'm pid {Environment.ProcessId}");

        while (true)
        {
            for (int i = 0; i < 1000; i++)
            {
                AllocateAndFreeComWrappers();
            }
            for (int i = 0; i < 5; i++)
            {
                GC.Collect();
                GC.WaitForPendingFinalizers();
            }
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void AllocateAndFreeComWrappers()
    {
        ComWrappers cw = new StrategyBasedComWrappers();
        nint ptr = cw.GetOrCreateComInterfaceForObject(sRootedFoo, CreateComInterfaceFlags.None);
        Marshal.Release(ptr);
    }
}

[GeneratedComInterface]
[Guid("3faca0d2-e7f1-4e9c-82a6-404fd6e0aab8")]
internal partial interface IFoo
{
    void Method(int i);
}

[GeneratedComClass]
internal partial class Foo : IFoo
{
    public void Method(int i)
    {
    }
}

When run on .NET 9 NativeAOT, memory usage stays flat. On .NET 9 CoreCLR, it already leaks memory. So maybe this regression is not a big deal since it only effects NativeAOT?

@AaronRobinsonMSFT
Copy link
Member

On .NET 9 CoreCLR, it already leaks memory.

This is an intentional memory leak due to incorrect usage of ComWrappers. There is a strict mapping between the ComWrappers that created the wrapper and the wrapper itself. I don't think attempting to solve this is needed and I would argue the current CoreCLR implementation has less overhead than the native AOT solution with the additional ConditionalWeakTable.

@jkoritzinsky Thanks for trakcing the performance impact between the two implementations. Did you see the peak memory usage between the two?

@jkoritzinsky
Copy link
Member Author

I'll go check the peak memory usage and report back.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory usage

I would like to know memory cost of minimal ComWrapper RCW and CCW before and after the change. (e.g. create a 100,000 of these, store them in array, and check the memory delta)

@jkoritzinsky
Copy link
Member Author

jkoritzinsky commented Apr 21, 2025

I compared the peak working set for the unit tests and got the following results:

PR: 28860 KiB
Main: 28920 KiB

I also compared a program that just creates 10000 RCWs that wrap 10000 CCWs of 10000 objects and got the following results:

PR: 164812 KiB
Main: 127544 KiB

I'll investigate what causes the increase (if it's just the table I added for diagnostics or something else) and see if I can mitigate it.

@jkoritzinsky
Copy link
Member Author

Looks like I can get the memory usage down, but I'll need to do the diagnostics work I was hoping to put off until another PR.

For a program that creates 1 million RCWs that wrap 1 million CCWs of 1 million objects, I get the following results:

Main: 1,027,288 KiB
PR: 1,339,048 KiB
PR with no special diagnostics assistance code in the runtime (no storing in the syncblock for diagnostics, no map for object->all ManagedObjectWrapperHolders): 640,480 KiB
PR with no syncblock, but has managed map for object->all ManagedObjectWrapperHolders: 779,548 KiB

Overall, it looks like we can switch, but we can't have both the syncblock info and the ConditionalWeakTables (which I guess makes sense).

…es on the ComWrappers class instead of using the syncblock
@jkoritzinsky jkoritzinsky added the tenet-performance Performance related issue label May 1, 2025
@jkoritzinsky
Copy link
Member Author

I was able to get the diagnostics support switched over. I validated that the dumpobj, dumprcw, and dumpccw commands all work correctly with ComWrappers in this PR.

…he diagnostics-only table with the debugger support feature switch that already exists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants