@ntherning
Contributor

This is the behavior of .NET. After this patch the code on Mono for Windows will make sure the underlying native thread of a runtime created thread has died before Thread.Join() returns.

This PR should make some of the tests in WaitHandleTest less flaky on Windows.

Contributor

Doesn't mono_join_uninterrupted () already do this ?

Contributor

Shouldn't this use the same mechanics as mono_threads_join_threads ?

Contributor Author

@vargaz IIUC mono_join_uninterrupted waits for an event to be signalled in the other thread (in mono_threads_signal_thread_handle which is called by unregister_thread). When the event is signalled the native thread has not yet been fully terminated.

@kumpera I guess you mean doing something similar to https://github.com/mono/mono/blob/master/mono/metadata/threads.c#L5048? Is that entire sequence starting with MONO_ENTER_GC_SAFE required?
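The gap described above (the exit event fires before the native thread is gone) can be sketched portably with POSIX threads. This is only an illustration, not Mono's code: `worker_state`, `exit_event`, `teardown_done`, and `join_event_then_native` are hypothetical names, the condition variable stands in for the event that `mono_join_uninterrupted` waits on, and the 50 ms sleep stands in for whatever native teardown remains after the event is signalled.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-ins: "exit_event" plays the role of the event the
 * joiner waits on; "teardown_done" marks the point where the native
 * thread has really finished its work. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  exit_event;
    int             exited;         /* protected by lock */
    atomic_int      teardown_done;  /* set at the very end of the thread */
} worker_state;

static void *worker_main(void *arg)
{
    worker_state *st = arg;

    /* Step 1: announce the exit, as the runtime would when unregistering. */
    pthread_mutex_lock(&st->lock);
    st->exited = 1;
    pthread_cond_broadcast(&st->exit_event);
    pthread_mutex_unlock(&st->lock);

    /* Step 2: native teardown keeps running after the announcement. */
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
    nanosleep(&pause, NULL);
    atomic_store(&st->teardown_done, 1);
    return NULL;
}

/* Waits for the exit event, then for the native thread itself.
 * Stores what was visible after phase 1 in *seen_before and returns
 * what is visible after phase 2 (or -1 if the thread could not start). */
static int join_event_then_native(int *seen_before)
{
    worker_state st;
    pthread_t tid;

    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.exit_event, NULL);
    st.exited = 0;
    atomic_init(&st.teardown_done, 0);

    if (pthread_create(&tid, NULL, worker_main, &st) != 0)
        return -1;

    /* Phase 1: the event alone says nothing about native termination. */
    pthread_mutex_lock(&st.lock);
    while (!st.exited)
        pthread_cond_wait(&st.exit_event, &st.lock);
    pthread_mutex_unlock(&st.lock);
    *seen_before = atomic_load(&st.teardown_done);  /* usually still 0 */

    /* Phase 2: joining the native thread closes the gap. */
    pthread_join(tid, NULL);
    int seen_after = atomic_load(&st.teardown_done);

    pthread_cond_destroy(&st.exit_event);
    pthread_mutex_destroy(&st.lock);
    return seen_after;
}
```

Only the value read after `pthread_join` is guaranteed; the value read after the event wait depends on scheduling, which is the kind of window the PR is trying to close on Windows.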

Contributor

I mean using that very thing, but replacing pthread_join with WaitOne

@luhenry
Contributor

luhenry commented Jun 14, 2017

@niklas why is it flaky if we don't do that? Isn't it less flaky with this change only because some races are less likely to happen when Join is a bit slower?

@ntherning
Contributor Author

@luhenry The flaky tests are the ones in WaitHandleTest that check for abandoned mutexes on Windows. On Windows the mutex state won't change to abandoned until the native thread has died completely. But our Thread.Join() doesn't guarantee this, so there's a slight chance that Join() returns and the mutex state is checked before the other thread is completely dead. Those tests fail every now and then on Windows, as you can see in Jenkins and locally. I cannot reproduce this failure on .NET, so I assume that Thread.Join() in .NET doesn't return until the native thread of the other thread has actually terminated.
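The Windows abandoned-mutex behavior cannot be demonstrated portably, but POSIX robust mutexes offer a loose analogy: when the owning thread exits without unlocking, the next locker is told so via `EOWNERDEAD`. The sketch below is that analogy only, not the Win32 API and not Mono's code; `robust_mutex`, `owner_main`, and `lock_result_after_owner_died` are hypothetical names. It joins the owner thread before checking the mutex, which is the ordering the tests rely on.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust_mutex;

/* Hypothetical owner thread: takes the mutex and exits without
 * releasing it, like the thread in the abandoned-mutex tests. */
static void *owner_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust_mutex);
    return NULL;
}

/* Returns the result of the first lock attempt made after the owner
 * thread has been joined (or -1 if the thread could not be started). */
static int lock_result_after_owner_died(void)
{
    pthread_mutexattr_t attr;
    pthread_t owner;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust_mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&owner, NULL, owner_main, NULL) != 0)
        return -1;

    /* Wait for the native thread to be gone before looking at the mutex. */
    pthread_join(owner, NULL);

    int rc = pthread_mutex_lock(&robust_mutex);
    if (rc == EOWNERDEAD)
        pthread_mutex_consistent(&robust_mutex);  /* make it usable again */
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(&robust_mutex);

    pthread_mutex_destroy(&robust_mutex);
    return rc;
}
```

On Windows the corresponding check would be a wait on the mutex handle returning `WAIT_ABANDONED`, which per the comment above only happens once the owning native thread has fully died.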

@ntherning ntherning force-pushed the wait-for-native-thread-to-die-in-Thread-Join-on-windows branch from a100cbb to a54c7ac Compare June 16, 2017 11:36
@ntherning
Contributor Author

build

@ntherning
Contributor Author

This appears to build fine and tests run as expected. Any objections to merging it? @kumpera You had a comment on the implementation?

@ntherning
Contributor Author

@kumpera @luhenry Please comment if there's anything I need to change in this PR. I'll merge this later this week if I hear no objections.

@kumpera
Contributor

kumpera commented Jun 27, 2017

@ntherning I think we must first explore whether extending the existing thread join code from unix to windows would solve it.

We do it in this way to ensure proper lifetime of the underlying OS primitive and I'm not sure your approach allows for it.

@ntherning
Contributor Author

@kumpera One thing this PR addresses is the abandoned mutex issue which makes WaitHandleTest.WaitAnyWithSecondMutexAbandoned fail randomly. If we want to keep using real Windows mutexes I don't see how we can replicate the behavior of .NET without waiting for the underlying native thread to die in Thread.Join() like mono will do with this PR applied. To my knowledge there's no way to set a Windows mutex to abandoned without the owning native thread actually dying without releasing it.

The alternative to this PR would be to change Mutex on Windows to emulate mutexes using some other OS primitive and maintain the state ourselves, similarly to how it's done on unix. But that would break interoperability when managed code and native code are sharing mutexes. So not really an option.

@vargaz
Contributor

vargaz commented Jun 28, 2017

Wouldn't calling mono_native_thread_join () solve this ?

@ntherning
Contributor Author

mono_native_thread_join() does almost exactly what the PR does (calls WaitForSingleObject()). I can make that change.

I don't think this is what @kumpera had in mind though. I think he would rather, if possible, align thread joining on Windows with what we do on unix. As I tried to explain, I don't think there's a way to do that while replicating what .NET does regarding mutexes.

@ntherning
Contributor Author

@kumpera I still need your comment on my previous reply.

@luhenry
Contributor

luhenry commented Aug 7, 2017

@ntherning if I understand correctly what @kumpera means is: please modify mono_thread_join / mono_threads_join_threads / mono_threads_add_joinable_thread to support win32, and use these functions, so we can use the same mechanism between Unix and Windows.

This is the behavior of .NET. After this patch the code on Mono for Windows
will make sure the underlying native thread of a runtime created thread has
died before Thread.Join() returns.
@ntherning ntherning force-pushed the wait-for-native-thread-to-die-in-Thread-Join-on-windows branch from a54c7ac to 6bf491a Compare August 23, 2017 13:54
@ntherning ntherning requested a review from luhenry as a code owner August 23, 2017 13:54
@ntherning
Contributor Author

But AFAICT ves_icall_System_Threading_Thread_Join_internal(), which this PR patches, doesn't use any of those functions you mention on any platform. This PR is about aligning the behavior of Thread.Join() on Mono on Windows with that of .NET, which will fix the problems we see in MonoTests.System.Threading.WaitHandleTest.WaitAnyWithSecondMutexAbandoned. I'm not trying to make other thread joins (internal to the runtime) behave like this on Windows. I don't think that is even desirable.

@kumpera
Contributor

kumpera commented Aug 23, 2017

Hi Niklas,

How about the following:

  1. move threads.c:mono_join_uninterrupted to the w32 layer (or mono-thread depending on fit).
  2. keep the current code as is for the unix version
  3. On the windows version, do a WaitOne on the actual thread code.

if (is_runtime_thread) {
    // The thread was created by the runtime. Make sure the underlying
    // native thread has terminated before we return.
    WaitForSingleObjectEx (thread->native_handle, INFINITE, FALSE);
Contributor

@kumpera kumpera Aug 23, 2017

This is wrong, the WaitOne should respect the timeout of the icall.
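A native join that honors a caller-supplied timeout, rather than waiting forever, might look like the following sketch. It is a POSIX illustration rather than the Windows fix: it assumes glibc's `pthread_timedjoin_np` extension (hence `_GNU_SOURCE`), and `join_with_timeout`, `sleeping_worker`, and `demo_join` are hypothetical names. On Windows the analogous change would be passing the icall's timeout to `WaitForSingleObjectEx` in place of `INFINITE`.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Joins tid, giving up after timeout_ms milliseconds.
 * Returns 0 when the thread was joined, ETIMEDOUT when the deadline passed.
 * Assumes glibc's pthread_timedjoin_np, which takes an absolute
 * CLOCK_REALTIME deadline. */
static int join_with_timeout(pthread_t tid, long timeout_ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_timedjoin_np(tid, NULL, &deadline);
}

/* Hypothetical worker that just sleeps for the requested time. */
static void *sleeping_worker(void *arg)
{
    long work_ms = *(long *)arg;
    struct timespec pause = {
        .tv_sec  = work_ms / 1000,
        .tv_nsec = (work_ms % 1000) * 1000000L
    };
    nanosleep(&pause, NULL);
    return NULL;
}

/* Starts a worker that runs for work_ms and tries to join it within
 * timeout_ms. Returns the result of the timed join (or -1 if the
 * thread could not be started). */
static int demo_join(long work_ms, long timeout_ms)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, sleeping_worker, &work_ms) != 0)
        return -1;

    int rc = join_with_timeout(tid, timeout_ms);
    if (rc != 0)
        pthread_join(tid, NULL);  /* still joinable after a timeout; reclaim it */
    return rc;
}
```

With these helpers, `demo_join(300, 20)` would be expected to report `ETIMEDOUT` and `demo_join(10, 2000)` to report `0`, so the caller can map the result to the boolean that a timed join returns.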

@vargaz
Contributor

vargaz commented Aug 24, 2017

How about we call mono_native_thread_join () which should do the same ?

@kumpera
Contributor

kumpera commented Aug 24, 2017

@vargaz AFAICT, we can't do that cuz pthread_join would fail on detached threads.
This is one of the reasons we have this convoluted thing in place.

OTOH, we could merge this as is and queue cleanup work to follow it. In such case, we'd only need to address the join timeout issue.

@vargaz
Contributor

vargaz commented Aug 24, 2017

We don't detach threads anymore, i.e. don't call pthread_detach ().

@kumpera
Contributor

kumpera commented Aug 24, 2017

Then I guess it's about how far @ntherning wants to go down the rabbit hole.

@vargaz
Contributor

vargaz commented Aug 29, 2017

Superseded by #5454.

@vargaz vargaz closed this Aug 29, 2017