Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[Question] GC pause time is too large for 60 fps #65850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
gbalykov opened this issue Feb 24, 2022 · 25 comments
Open

[Question] GC pause time is too large for 60 fps #65850

gbalykov opened this issue Feb 24, 2022 · 25 comments

Comments

@gbalykov
Copy link
Member

We are investigating GC pause time in order to potentially achieve 60 fps, but we observe long GC EE suspend pauses.

For example, on next benchmark max GC pause on arm64(rpi4) can get as high as 55 ms with default GC setup (i.e. default latency mode, default max gen0/gen1 size). This is a huge pause, because 1 frame in 60 fps mode should fit in ~16 ms.

Some details:

  • GC pause time is computed as diff between GCRestartEEEnd and GCSuspendEEBegin events
  • SustainedLowLatency mode doesn't improve max GC pause time
  • LowLatency mode reduces max GC pause time, but max RSS increases signigicantly (on benchmark above ~10x increase in max RSS)
  • different values of COMPlus_GCGen0MaxBudget and COMPlus_GCGen1MaxBudget do not improve max GC pause time

Additionally, we've found the next measurements on coreclr/mono and results are pretty much the same: https://gist.github.com/jechter/2730225240163a806fcc15c44c5ac2d6.

Are there any plans to support GC mode with limited pause time?

using System;
using System.Threading;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

namespace gctest2
{
    class Test
    {
        static void Main()
        {
            GCEventListener.Start();
            Test t = new Test();
            t.mGCTest();
        }

        struct Log
        {
            public String log;
        }

        private static char[] base_string =  "GC  TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT".ToCharArray();

        public void mGCTest()
        {
            int count = 0;
            int max = 102400;
            
            List<Log> list = new List<Log>();

            for (int index1 = 0; index1 < 2000; ++index1)
			{
				for (int index2 = 0; index2 < 1000; ++index2)
				{
					String sd = $"GC  TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT {count} / {max}";
					Log log = new Log();
					log.log = sd;
					list.Add(log);

					if (list.Count == max)
					{
						while (list.Count > 0)
						{
							list.RemoveAt(0);
						}
					}
					count++;
				}
            }
        }
    }

    public sealed class GCEventListener : EventListener
    {
        private static GCEventListener instance = null;
        private long timeGCStart = 0;
        private bool verbose = false;

        private GCEventListener()
        {
            Console.WriteLine("GCEventListener Created");
        }

        public static void Start(bool verbose = false)
        {
            if(instance == null)
                instance = new GCEventListener();
        }

        // Called whenever an EventSource is created.
        protected override void OnEventSourceCreated(EventSource eventSource)
        {
            // Watch for the .NET runtime EventSource and enable all of its events.
            if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
            {
                EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x1);
            }
        }

        private void PrintEventData(EventWrittenEventArgs eventData)
        {
            Console.WriteLine($"ThreadID = {eventData.OSThreadId} ID = {eventData.EventId} Name = {eventData.EventName} Tick = {eventData.TimeStamp.Ticks/10.0/1000.0}");

            for (var i = 0; i < eventData.Payload.Count; i++)
            {
                string payloadString = eventData.Payload[i] != null ? eventData.Payload[i].ToString() : string.Empty;
                Console.WriteLine($"\tName = \"{eventData.PayloadNames[i]}\" Value = \"{payloadString}\"");
            }

            Console.WriteLine("\n");
        }

        // Called whenever an event is written.
        protected override void OnEventWritten(EventWrittenEventArgs eventData)
        {
			PrintEventData(eventData);
            if (eventData.EventName.Contains("Concurrent"))
            {
                PrintEventData(eventData);
            }
            else if (eventData.EventName.Contains("GCStart"))
            {
                timeGCStart = eventData.TimeStamp.Ticks;
                //PrintEventData(eventData);
            }
            else if (eventData.EventName.Contains("GCEnd"))
            {
                long timeGCEnd = eventData.TimeStamp.Ticks;
                long gcIndex = long.Parse(eventData.Payload[0].ToString());
                Console.WriteLine("GC#{0} took {1:f3}ms for generation {2}", gcIndex, (double) (timeGCEnd - timeGCStart)/10.0/1000.0, eventData.Payload[1]);
                //PrintEventData(eventData);
            }
        }
	}
}

cc @alpencolt @jkotas

@ghost
Copy link

ghost commented Feb 24, 2022

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Feb 24, 2022
@ghost
Copy link

ghost commented Feb 24, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

We are investigating GC pause time in order to potentially achieve 60 fps, but we observe long GC EE suspend pauses.

For example, on next benchmark max GC pause on arm64(rpi4) can get as high as 55 ms with default GC setup (i.e. default latency mode, default max gen0/gen1 size). This is a huge pause, because 1 frame in 60 fps mode should fit in ~16 ms.

Some details:

  • GC pause time is computed as diff between GCRestartEEEnd and GCSuspendEEBegin events
  • SustainedLowLatency mode doesn't improve max GC pause time
  • LowLatency mode reduces max GC pause time, but max RSS increases signigicantly (on benchmark above ~10x increase in max RSS)
  • different values of COMPlus_GCGen0MaxBudget and COMPlus_GCGen1MaxBudget do not improve max GC pause time

Additionally, we've found the next measurements on coreclr/mono and results are pretty much the same: https://gist.github.com/jechter/2730225240163a806fcc15c44c5ac2d6.

Are there any plans to support GC mode with limited pause time?

using System;
using System.Threading;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

namespace gctest2
{
    class Test
    {
        static void Main()
        {
            GCEventListener.Start();
            Test t = new Test();
            t.mGCTest();
        }

        struct Log
        {
            public String log;
        }

        private static char[] base_string =  "GC  TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT".ToCharArray();

        public void mGCTest()
        {
            int count = 0;
            int max = 102400;
            
            List<Log> list = new List<Log>();

            for (int index1 = 0; index1 < 2000; ++index1)
			{
				for (int index2 = 0; index2 < 1000; ++index2)
				{
					String sd = $"GC  TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT {count} / {max}";
					Log log = new Log();
					log.log = sd;
					list.Add(log);

					if (list.Count == max)
					{
						while (list.Count > 0)
						{
							list.RemoveAt(0);
						}
					}
					count++;
				}
            }
        }
    }

    public sealed class GCEventListener : EventListener
    {
        private static GCEventListener instance = null;
        private long timeGCStart = 0;
        private bool verbose = false;

        private GCEventListener()
        {
            Console.WriteLine("GCEventListener Created");
        }

        public static void Start(bool verbose = false)
        {
            if(instance == null)
                instance = new GCEventListener();
        }

        // Called whenever an EventSource is created.
        protected override void OnEventSourceCreated(EventSource eventSource)
        {
            // Watch for the .NET runtime EventSource and enable all of its events.
            if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
            {
                EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x1);
            }
        }

        private void PrintEventData(EventWrittenEventArgs eventData)
        {
            Console.WriteLine($"ThreadID = {eventData.OSThreadId} ID = {eventData.EventId} Name = {eventData.EventName} Tick = {eventData.TimeStamp.Ticks/10.0/1000.0}");

            for (var i = 0; i < eventData.Payload.Count; i++)
            {
                string payloadString = eventData.Payload[i] != null ? eventData.Payload[i].ToString() : string.Empty;
                Console.WriteLine($"\tName = \"{eventData.PayloadNames[i]}\" Value = \"{payloadString}\"");
            }

            Console.WriteLine("\n");
        }

        // Called whenever an event is written.
        protected override void OnEventWritten(EventWrittenEventArgs eventData)
        {
			PrintEventData(eventData);
            if (eventData.EventName.Contains("Concurrent"))
            {
                PrintEventData(eventData);
            }
            else if (eventData.EventName.Contains("GCStart"))
            {
                timeGCStart = eventData.TimeStamp.Ticks;
                //PrintEventData(eventData);
            }
            else if (eventData.EventName.Contains("GCEnd"))
            {
                long timeGCEnd = eventData.TimeStamp.Ticks;
                long gcIndex = long.Parse(eventData.Payload[0].ToString());
                Console.WriteLine("GC#{0} took {1:f3}ms for generation {2}", gcIndex, (double) (timeGCEnd - timeGCStart)/10.0/1000.0, eventData.Payload[1]);
                //PrintEventData(eventData);
            }
        }
	}
}

cc @alpencolt @jkotas

Author: gbalykov
Assignees: -
Labels:

area-GC-coreclr, untriaged

Milestone: -

@jkotas
Copy link
Member

jkotas commented Feb 24, 2022

The current CoreCLR GC does not have guaranteed upper bound on the pause times. The pause times can the significant if the application creates patterns unfriendly to generational GC like long linked list or a lot of Gen2 -> Gen0 references.

Have you analyzed where the time is spent in the GC for this microbenchmark? Looking at the microbenchmark code, I think large number of Gen2 -> Gen0 references may be contributing be the significant pause times.

Are there any plans to support GC mode with limited pause time?

I am not aware of plans to build a GC mode with max pause time that is guaranteed even when the application uses patterns that are unfriendly to GC. It would likely need to be a very different GC from what we have today. I think we would be happy to accept a GC implementation like that into dotnet/runtime if somebody builds it.

@EgorBo
Copy link
Member

EgorBo commented Feb 24, 2022

different values of COMPlus_GCGen0MaxBudget and COMPlus_GCGen1MaxBudget do not improve max GC pause time

weird, I'd expect smaller gen0 to lead to more frequent GCs but smaller pauses overall (disclaimer: not a gc expert)
However, due to the L3 issue I assume the default Gen0 for your Pi4 is already set to some really small value like 256kb (see #64645). Have you tried to play with DOTNET_GCgen0size= instead? (just in case - these config switches accept numbers in HEX format)

@jkotas
Copy link
Member

jkotas commented Feb 24, 2022

weird, I'd expect smaller gen0 to lead to more frequent GCs but smaller pauses overall (disclaimer: not a gc expert)

That's true only for programs that behave according to the simplifying assumptions that the generational GC designs are based on. The example above violates these simplifying assumptions. It ends up with a large Gen2 array that has to be scanned by Gen0 GC every time since it may contain Gen0 back references.

Here is a simpler example that amplifies the problem. It has Gen0 GC pause times around 0.3 seconds on my desktop machine. Changing the Gen0 budged does not have affect on the GC pause times.

object[] a = new object[100_000_000];

for (;;)
{
    // Create a lot of Gen2 -> Gen0 references to keep the GC busy
    object o = new object();
    for (int i = 0; i < a.Length; i++)
    {
        a[i] = o;
    }
    // Create a lot of short lived objects to trigger Gen0 GC
    for (int i = 0; i < 1000; i++)
    {
        GC.KeepAlive(new string('a', 10000));
    }
}

BTW: It is not unusual to see these kind of anti-patterns in "high-performance" code that tries to cache and reuse data structures to "save" the GC work.

@Maoni0
Copy link
Member

Maoni0 commented Feb 25, 2022

before we provide soft real time guarantee (which is a lot of work as you can imagine), I was actually thinking of provide a mode for our GC with no generations just for workloads with no generational behavior at all. so every GC is a full GC. I hacked up a version to do this tonight 😄 with @jkotas's code I get this

comparison

the pause time in single gen is actually very tiny, but there's one BGC that has that max pause which made the avg pause 8.1ms instead of < 1ms -

image

what I have right now is very hacky. but if anyone is interested, I can see if I could get some time in the not too distant future make it reasonable quality to become an experimental feature (ie, can be turned on with a config).

@huoyaoyuan
Copy link
Member

GC pause time should be related to the CPU power. On PC processor, getting 600fps with SubstainedLowLatency is achievable.
Anyway, reducing GC pressure should be beneficial.

@gbalykov
Copy link
Member Author

Thanks for all the answers! Benchmark that I was testing is a sample GC stress test benchmark and I agree that it may not represent optimized behavior in terms of GC, but we wanted to see how GC behaves in this mode too.

Have you tried to play with DOTNET_GCgen0size= instead?

I've only tested different values of gen0_max_size and gen1_max_size (with which static_data_table[i] is filled), and on rpi4 gen0_min_size=min(1Mb, gen0_max_size). Also I've been testing from COMPlus_GCGen0MaxBudget=10000 (64 Kb) to COMPlus_GCGen0MaxBudget=10000000 (256 Mb) with some step, and max GC pause time varies but always stays above ~30 ms.

I'd expect smaller gen0 to lead to more frequent GCs but smaller pauses overall

This is true that average GC pause reduces significantly (to ~1-2ms with COMPlus_GCGen0MaxBudget=10000) and frequency of GCs increases significantly (more than x100), but max GC pause time still remains above ~30 ms.

what I have right now is very hacky. but if anyone is interested, I can see if I could get some time in the not too distant future make it reasonable quality to become an experimental feature (ie, can be turned on with a config).

@Maoni0 This looks very interesting, can you share more details on how to make this hack myself or maybe even patch, so that I can experiment with it?

@Maoni0
Copy link
Member

Maoni0 commented Feb 25, 2022

@gbalykov I need to do a bit more work on it - if I can get some time to work on this next week I'll give you a commit to try.

@Maoni0
Copy link
Member

Maoni0 commented Feb 25, 2022

@gbalykov also for the occasional long pause you are seeing right now, I would suggest 2 things -

does it show that the Promote MB is also high for those GCS? if so could you please try the generation aware analysis that we introduced in .NET 5? it would help us figure out what's causing the more promoted bytes.

if the Promoted is the same yet the GC time is longer, you could also capture a trace with CPU samples with perfview if you are on windows. I have a commandline in mem-doc (search for StopOnGCOverMSec).

@gbalykov
Copy link
Member Author

On the benchmark that I've shared above TotalPromotedSize0, TotalPromotedSize1 and TotalPromotedSize2 can get quite large, however, TotalPromotedSize0 is large almost for all GCs, even with small pauses. As far as I can tell large pauses mostly come from large TotalPromotedSize1 and TotalPromotedSize2. I think that @jkotas is right that this benchmark is GC unfriendly. I'll try to check generation aware analysis, thanks.

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Feb 28, 2022
@mangod9 mangod9 added this to the Future milestone Feb 28, 2022
@smoogipoo
Copy link

Also going to link #48937 here, where we've experienced Gen1 collection times greater than the 16ms window of 60fps, which is a bare minimum for us.

In my testing there, we didn't see what I would consider a "large amount" of promotion: https://user-images.githubusercontent.com/1329837/118099109-c0f93a00-b40f-11eb-80ae-db3ccf4a39a0.png as what's shown in this issue. I've forgotten how many times I've read through @Maoni0 's GC doc all to come to the conclusion that we're not doing anything out of the ordinary.

I was actually thinking of provide a mode for our GC with no generations just for workloads with no generational behavior at all

I would also be very interested and am willing to test this even off the main branch.

@aaron-manning
Copy link

aaron-manning commented Sep 18, 2022

+1

The lack of a low-latency GC option makes C#/F# sub-optimal for low-latency projects, eg high-frequency-trading.
Seems like a big hole in an otherwise great runtime.

@gbalykov
Copy link
Member Author

@Maoni0 did you have a chance to work on GC with no generations?

@Maoni0
Copy link
Member

Maoni0 commented Nov 16, 2022

I did, but only a little bit :( I got it to kind of run but clearly there were bugs. I'm hoping to get more time to work on this now that we finished .net 7. this seems like a good project for the holidays...

@roozbehid
Copy link
Contributor

+1

I also need a consistent 20ms processing and was doing a research on if it is achievable with c# or not.
It seems it is not yet. Hopefully. Net 8!

@gbalykov
Copy link
Member Author

@Maoni0 hi, did you have a chance to continue your work on GC with no generations?

@AlgorithmsAreCool
Copy link
Contributor

Adding another voice, I am currently developing a latency sensitive application and I'm also interested in the above discussion of a generationless GC as a potential option.

@Yen
Copy link

Yen commented Oct 8, 2023

Not experienced with the inner workings of GCs and can't talk to the complexities to such an approach, but I would like a way to have more control over the GC without requiring hard guarantees. I am building a UI platform (not a game engine where a dropped frame is the end of the world) and would like to have ways to potentially help avoid large GC pauses if I can instead perform smaller ones more often. For example, a call to GC.Collect with a target max duration would seem like an idea API for these scenarios where I can calculate through runtime heuristics how long I have free before a next frame might want to be displayed, and as such I can tell the garbage collector it has some time to do anything it might want to. I understand there is a lot going on in a full GC pause that can't be easily broken up, and as such for something like a game engine solving this is hard, but having the option get some smaller bits done when I know there is time to do so, in hopes that it would make a full GC pause slightly less regular, would be a nice to have for any user interaction dependent workload.

I am aware that this is a difficult problem to solve and, in many ways, goes against the point of the GC to try make it do things it doesn't want to do "optimally" from the GCs perspective. However, I think providing more primitives, even if it's easy to misuse them, that allow a developer in a high-performance scenario to dictate information to the garbage collector in hopes of avoiding a large GC pause would be appreciated.

@denis-paranichev
Copy link
Contributor

Hi, @Maoni0, did you have a chance to continue your work on generationless GC?

@VSadov
Copy link
Member

VSadov commented May 8, 2025

weird, I'd expect smaller gen0 to lead to more frequent GCs but smaller pauses overall (disclaimer: not a gc expert)

That's true only for programs that behave according to the simplifying assumptions that the generational GC designs are based on. The example above violates these simplifying assumptions. It ends up with a large Gen2 array that has to be scanned by Gen0 GC every time since it may contain Gen0 back references.

Here is a simpler example that amplifies the problem. It has Gen0 GC pause times around 0.3 seconds on my desktop machine. Changing the Gen0 budged does not have affect on the GC pause times.

object[] a = new object[100_000_000];

for (;;)
{
    // Create a lot of Gen2 -> Gen0 references to keep the GC busy
    object o = new object();
    for (int i = 0; i < a.Length; i++)
    {
        a[i] = o;
    }
    // Create a lot of short lived objects to trigger Gen0 GC
    for (int i = 0; i < 1000; i++)
    {
        GC.KeepAlive(new string('a', 10000));
    }
}

BTW: It is not unusual to see these kind of anti-patterns in "high-performance" code that tries to cache and reuse data structures to "save" the GC work.

This was referenced from a different issue (#96213), so I tried the scneario with Satori.

I see what the sample tries to achieve (deliberately trashes cards from a no-allocations code path while concurrent GC is in progress). I think it could become a problem in huge quantities, but GC should be able to handle some degree of abuse.

I've added pause instrumentation and the following code to force Gen1, because all-Gen0 is too easy.

                var obj = new string('a', 10000);
                GC.KeepAlive(obj);

                // force the obj into gen1, we know that Gen0 are fast.
                a[0] = new string('a', 10000);

With Satori in default mode I see:
(on win-x64 \w ryzen 5950X):

GCEventListener Created
GC#1 took 16.692ms for generation 2
GC#2 took 15.459ms for generation 2
GC#3 took 0.139ms for generation 1
GC#4 took 0.077ms for generation 1
GC#5 took 0.076ms for generation 1
GC#6 took 0.107ms for generation 1
GC#7 took 0.114ms for generation 1
GC#8 took 0.089ms for generation 1
GC#9 took 0.086ms for generation 1
GC#10 took 0.092ms for generation 1
. . . 
. . .
GC#100 took 0.098ms for generation 1
GC#101 took 0.118ms for generation 1
GC#102 took 0.109ms for generation 1
GC#103 took 0.086ms for generation 1
GC#104 took 0.095ms for generation 1
GC#105 took 0.110ms for generation 1
GC#106 took 0.104ms for generation 1
GC#107 took 0.116ms for generation 1
GC#108 took 0.143ms for generation 1

With set DOTNET_GCLatencyMode=3

GCEventListener Created
GC#1 took 0.189ms for generation 2
GC#2 took 0.094ms for generation 2
GC#3 took 0.156ms for generation 1
GC#4 took 0.096ms for generation 1
GC#5 took 0.116ms for generation 1
GC#6 took 0.103ms for generation 1
GC#7 took 0.064ms for generation 1
GC#8 took 0.094ms for generation 1
GC#9 took 0.112ms for generation 1
. . .
. . .

And with a[0] = new string('a', 10000); commented out
(NOTE: GCs under 10 microseconds are not printed as there are too many, thus there are gaps in the sequence)

GCEventListener Created
GC#1 took 0.026ms for generation 0
GC#2 took 0.141ms for generation 2
GC#3 took 0.014ms for generation 0
GC#9 took 0.026ms for generation 0
GC#19 took 0.026ms for generation 0
GC#29 took 0.026ms for generation 0
GC#39 took 0.030ms for generation 0
GC#49 took 0.033ms for generation 0
GC#52 took 0.032ms for generation 0
GC#60 took 0.028ms for generation 0
GC#70 took 0.029ms for generation 0
GC#81 took 0.023ms for generation 0
GC#91 took 0.026ms for generation 0
GC#102 took 0.027ms for generation 0
GC#113 took 0.033ms for generation 0
GC#124 took 0.031ms for generation 0
GC#135 took 0.024ms for generation 0
GC#146 took 0.029ms for generation 0
GC#158 took 0.033ms for generation 0
GC#169 took 0.025ms for generation 0

The full source of the benchmark:

using System;
using System.Threading;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using System.Collections.Concurrent;

class Program
{
    static void Main()
    {
        // System.Runtime.GCSettings.LatencyMode = System.Runtime.GCLatencyMode.LowLatency;

        GCEventListener.Start();
        Test();
    }

    public static void Test()
    {
        object[] a = new object[100_000_000];

        for (; ; )
        {
            // Create a lot of Gen2 -> Gen0 references to keep the GC busy
            object o = new object();
            for (int i = 0; i < a.Length; i++)
            {
                a[i] = o;
            }
            // Create a lot of short lived objects to trigger Gen0 GC
            for (int i = 0; i < 1000; i++)
            {
                var obj = new string('a', 10000);
                GC.KeepAlive(obj);

                // force the obj into gen1, we know that Gen0 are fast.
                a[0] = new string('a', 10000);
            }
        }
    }
}

public sealed class GCEventListener : EventListener
{
    private static GCEventListener instance = null;

    private GCEventListener()
    {
        Console.WriteLine("GCEventListener Created");
    }

    public static void Start(bool verbose = false)
    {
        if (instance == null)
            instance = new GCEventListener();
    }

    // Called whenever an EventSource is created.
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        // Watch for the .NET runtime EventSource and enable all of its events.
        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x1);
        }
    }

    private ConcurrentDictionary<uint, long> gcStarts = new ConcurrentDictionary<uint, long>();

    // Called whenever an event is written.
    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventName.Contains("GCStart"))
        {
            uint gcIndex = (uint)eventData.Payload[0];
            long timeGCStart = eventData.TimeStamp.Ticks;
            gcStarts.TryAdd(gcIndex, timeGCStart);
            return;
        }

        if (eventData.EventName.Contains("GCEnd"))
        {
            uint gcIndex = (uint)eventData.Payload[0];

            if (gcStarts.TryRemove(gcIndex, out var timeGCStart))
            {
                long timeGCEnd = eventData.TimeStamp.Ticks;
                double pauseMsec = (double)(timeGCEnd - timeGCStart) / 10.0 / 1000.0;

                // ignore too short ones, to print less
                if (pauseMsec > 0.01)
                {
                    Console.WriteLine("GC#{0} took {1:f3}ms for generation {2}", gcIndex, pauseMsec, eventData.Payload[1]);
                }
            }
        }
    }
}

@AlgorithmsAreCool
Copy link
Contributor

AlgorithmsAreCool commented May 8, 2025

Those first couple of gen2s are pretty bad outliers, but it looks like the gen 1 pauses are at least consistent if not slightly worse on than the traditional collector

@VSadov
Copy link
Member

VSadov commented May 8, 2025

I think the outliers were likely compacting GCs as in default mode Satori will compact if it sees benefit in doing that.

In the next runs with low latency mode it does not compact, so no outliers.

@AlgorithmsAreCool
Copy link
Contributor

I'm running them locally in a few different GC modes, i'll report back with histograms

@AlgorithmsAreCool
Copy link
Contributor

Benchmark details posted here : #96213 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests