@alexrp
Contributor

@alexrp alexrp commented Nov 28, 2017

With the revamped GC root reporting work, some GC root unregister events arrive after the thread_stopped event. This causes the thread to be re-added to the profiler's thread list and never removed until program exit. As a result, if such a thread had actually exited and a new thread reused its thread ID, the new thread would fail to add itself to the thread list, leading to failures like this one:

init_thread: failed to insert thread 0x70000b761000 in log_profiler.profiler_thread_list, found = true

By using the new thread_exited event, we remove the thread from the thread list after these GC root unregister events have arrived, thereby ensuring that the thread won't be incorrectly 'resurrected'.
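
To make the failure mode concrete, here is a minimal, self-contained sketch of the sequence described above. It is a toy model, not the log profiler's code: `ThreadList`, `list_insert`, `on_gc_root_unregister`, and `simulate_id_reuse` are hypothetical names, and the "re-add on any event" behavior is an assumption based on the description in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 8

/* Hypothetical stand-in for the profiler's thread list: a small array of IDs. */
typedef struct {
    uintptr_t ids[MAX_THREADS];
    size_t count;
} ThreadList;

static bool list_contains(const ThreadList *l, uintptr_t tid)
{
    for (size_t i = 0; i < l->count; i++)
        if (l->ids[i] == tid)
            return true;
    return false;
}

/* Fails (returns false) if the ID is already present or the list is full. */
static bool list_insert(ThreadList *l, uintptr_t tid)
{
    if (list_contains(l, tid) || l->count == MAX_THREADS)
        return false;
    l->ids[l->count++] = tid;
    return true;
}

static void list_remove(ThreadList *l, uintptr_t tid)
{
    for (size_t i = 0; i < l->count; i++) {
        if (l->ids[i] == tid) {
            l->ids[i] = l->ids[--l->count]; /* swap in the last entry */
            return;
        }
    }
}

/* Assumed behavior: an event from an unknown thread lazily re-registers it. */
static void on_gc_root_unregister(ThreadList *l, uintptr_t tid)
{
    if (!list_contains(l, tid))
        list_insert(l, tid);
}

/*
 * Replays one thread's lifetime, then lets a new thread reuse its ID.
 * Returns true if the new thread manages to register itself.
 */
static bool simulate_id_reuse(bool remove_on_exited)
{
    ThreadList list = {0};
    const uintptr_t tid = 0x1000; /* arbitrary illustrative ID */

    bool started = list_insert(&list, tid);
    assert(started);

    if (!remove_on_exited)
        list_remove(&list, tid);       /* old: removal in thread_stopped */

    on_gc_root_unregister(&list, tid); /* late event 'resurrects' the thread */

    if (remove_on_exited)
        list_remove(&list, tid);       /* new: removal in thread_exited */

    return list_insert(&list, tid);    /* new thread with the same ID */
}
```

In this model, `simulate_id_reuse(false)` reports a failed insertion because the late event leaves a stale entry behind, while `simulate_id_reuse(true)` succeeds because removal happens after the last event for the old thread.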

@alexrp
Contributor Author

alexrp commented Nov 28, 2017

@luhenry could you try folding this into #5710 temporarily to see if it fixes the assertions in the stress tests?

@luhenry
Contributor

luhenry commented Nov 28, 2017

@alexrp I added the commits to #5710. Also, FYI, you are able to push whatever commits you want to the PR since "Allow edits from maintainers" is checked, so next time feel free to push a fix if you think it's needed or helpful. Thank you!

Contributor

Interesting change, because the old code was incorrect.
MONO_PROFILER_RAISE uses hazard pointers in the same way as mono_thread_info_lookup, so whatever we need to be protected would be overwritten.
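
For readers unfamiliar with the concern, the following toy model illustrates it under stated assumptions. It is single-threaded and does not use Mono's hazard pointer API: `hazard_slots`, `toy_lookup`, `toy_clear`, and `protected_after_raise` are hypothetical names. The only point it shows is that if a lookup leaves its result in a fixed slot, a callback that performs its own lookup overwrites that slot, so the first result is no longer protected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HP_SLOTS 3

/* Toy model: the hazard pointer slots of a single thread. */
static void *hazard_slots[HP_SLOTS];

static bool is_protected(const void *p)
{
    for (size_t i = 0; i < HP_SLOTS; i++)
        if (hazard_slots[i] == p)
            return true;
    return false;
}

/* Models a lookup that leaves its result published in slot 1. */
static void *toy_lookup(void *obj)
{
    hazard_slots[1] = obj;
    return obj;
}

static void toy_clear(size_t slot)
{
    assert(slot < HP_SLOTS);
    hazard_slots[slot] = NULL;
}

static int other_object;

/* A callback that does its own lookup and therefore reuses slot 1. */
static void callback_with_lookup(void)
{
    toy_lookup(&other_object);
}

/* A callback that never touches the hazard pointer slots. */
static void callback_without_lookup(void)
{
}

/*
 * Looks up an object, invokes the callback, and reports whether the
 * object is still covered by a hazard pointer slot afterwards.
 */
static bool protected_after_raise(void (*callback)(void))
{
    static int thread_info;

    void *info = toy_lookup(&thread_info);
    callback();
    bool still_protected = is_protected(info);

    toy_clear(1); /* leave the slots clean for the next caller */
    return still_protected;
}
```

With `callback_without_lookup` the looked-up object stays protected across the call; with `callback_with_lookup` it does not, which is the overwrite described in the comment above.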

Contributor Author

Well, the callbacks invoked by MONO_PROFILER_RAISE may or may not use hazard pointers. The mono_hazard_pointer_clear below was there to clear HP1 (which would have been set by mono_thread_info_lookup) in case the profiler callbacks had not overwritten it with something else. Was that approach actually problematic?

Contributor

If this keeps happening frequently, we could return a dummy MonoProfilerThread instead.

Contributor Author

@alexrp alexrp Nov 28, 2017

It should never happen, and indeed never did outside #5710, where it only happened because that PR introduced events that got fired after the thread_stopped event. I think an assertion is fine; we don't want a bug like this to go unnoticed. (Also, it already passed the stress tests in an earlier run of this PR, so I think we're good.)

Contributor

Does it make sense to have thread_exited introduce a new callback typedef while thread_stopping doesn't?

Contributor Author

Copy/paste error.

Alex Rønne Petersen added 4 commits November 28, 2017 19:06
thread_stopping occurs earlier than thread_stopped, before any of the detach
code has run. thread_exited occurs after thread_stopped, once all detach logic
has finished.
…events.

With the revamped GC root reporting work, some GC root unregister events
arrive after the thread_stopped event. This causes the thread to be re-added
to the profiler's thread list and never removed until program exit. As a
result, if such a thread had actually exited and a new thread reused its
thread ID, the new thread would fail to add itself to the thread list,
leading to failures like this one:

    init_thread: failed to insert thread 0x70000b761000 in log_profiler.profiler_thread_list, found = true

By using the new thread_exited event, we remove the thread from the thread list
after these GC root unregister events have arrived, thereby ensuring that the
thread won't be incorrectly 'resurrected'.
…ents.

These checks are no longer necessary as these thread callbacks are not invoked
for tools threads in the first place.
@alexrp
Contributor Author

alexrp commented Nov 28, 2017

@DavidKarlas said this fixes the assertions for him.

@alexrp
Contributor Author

alexrp commented Nov 28, 2017

@monojenkins merge

@alexrp
Contributor Author

alexrp commented Nov 28, 2017

Stress test failure is unrelated and will be fixed by #6109.

@monojenkins
Contributor

cannot merge:

  • "Linux i386" state is "failure"
  • "Linux x64" state is "success"
  • "Linux ARMv7 hard float" state is "failure"
  • "Linux AArch64" state is "success"
  • "OS X i386" state is "success"
  • "OS X x64" state is "success"
  • "Windows i386" state is "success"
  • "Windows x64" state is "success"

@kumpera
Contributor

kumpera commented Nov 29, 2017

Stress test is failing on "* Assertion at loader.c:1898, condition `res != NULL' not met" in get_method_constrained.

@kumpera kumpera merged commit 0c88bc2 into mono:master Nov 29, 2017
@alexrp alexrp deleted the profiler-lls-fix branch November 30, 2017 06:22
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
[profiler] Fix thread list insertion failures.

Commit migrated from mono/mono@0c88bc2