Sockets.Unix race between receive completion and cancellation? #115217

Open
tmds opened this issue May 1, 2025 · 2 comments
@tmds
Member

tmds commented May 1, 2025

I'm doing some testing of https://github.com/tmds/Tmds.Ssh/ and I occasionally get an unexpected runtime crash:

Fatal error. Internal CLR error. (0x80131506)
   at System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)
   at System.Runtime.EH.RhThrowEx(System.Object, ExInfo ByRef)
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.ThrowException(System.Net.Sockets.SocketError, System.Threading.CancellationToken)
   at System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16)
   at Tmds.Ssh.StreamSshConnection+<ReceiveAsync>d__21.MoveNext()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Int32, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext(System.Threading.Thread)
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1[[System.Boolean, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].SetResult(Boolean)
   at System.Net.Sockets.SocketAsyncEventArgs.TransferCompletionCallbackCore(Int32, System.Memory`1<Byte>, System.Net.Sockets.SocketFlags, System.Net.Sockets.SocketError)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()

Based on the stack trace, I think this is due to a race between a receive operation on the socket that is completing successfully (TransferCompletionCallbackCore at the bottom of the stack), and that same receive operation also completing due to cancellation (CancellationToken.ThrowOperationCanceledException at the top of the stack).

To support that hypothesis, I changed Tmds.Ssh's receive code to cancel through Task.WaitAsync rather than cancelling the socket operation. When I make this change, the crashes no longer occur.

     private async ValueTask<int> ReceiveAsync(CancellationToken ct)
     {
         var memory = _receiveBuffer.AllocGetMemory(Constants.PreferredBufferSize);
-        int received = await _stream.ReadAsync(memory, ct).ConfigureAwait(false);
+        int received;
+        Task<int> receiveTask = _stream.ReadAsync(memory).AsTask();
+        try
+        {
+            received = await receiveTask.WaitAsync(ct).ConfigureAwait(false);
+        }
+        catch
+        {
+            (_stream as System.Net.Sockets.NetworkStream)!.Socket.Dispose();
+
+            await receiveTask.ConfigureAwait(false);
+
+            throw;
+        }
+
         _receiveBuffer.AppendAlloced(received);
         return received;
     }

I saw this issue on a setup I created specifically for my tests (with cloud VMs). I don't have an easy reproducer at the moment.
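
A possible starting point for a reproducer is a loop that starts a cancellable receive, then sends data and cancels at nearly the same moment, so that successful completion and cancellation compete for the same operation. The sketch below is hypothetical and untested against this crash: the class name `ReceiveCancelRace`, the method `RunAsync`, the loopback TCP pair, and the buffer sizes are all assumptions made for illustration, and nothing here is taken from Tmds.Ssh.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stress loop (not from the issue): races a receive that is about
// to complete with data against cancellation of that same receive.
public static class ReceiveCancelRace
{
    public static async Task<(int Completed, int Canceled)> RunAsync(int iterations)
    {
        using var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(1);

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        await client.ConnectAsync(listener.LocalEndPoint!);
        using Socket server = await listener.AcceptAsync();

        byte[] sendBuffer = new byte[1];
        byte[] receiveBuffer = new byte[16];
        int completed = 0, canceled = 0;

        for (int i = 0; i < iterations; i++)
        {
            using var cts = new CancellationTokenSource();

            // No data is queued at this point, so the receive should start out pending.
            ValueTask<int> receive = client.ReceiveAsync(receiveBuffer.AsMemory(), SocketFlags.None, cts.Token);

            // Kick off the send and the cancellation as close together as possible.
            Task send = Task.Run(() => server.Send(sendBuffer));
            cts.Cancel();

            try
            {
                await receive;
                completed++;
            }
            catch (OperationCanceledException)
            {
                canceled++;
                // The byte was not consumed; drain it so the next iteration
                // starts with an empty socket again.
                await send;
                await client.ReceiveAsync(receiveBuffer.AsMemory(), SocketFlags.None);
            }

            await send;
        }

        return (completed, canceled);
    }
}
```

Calling `await ReceiveCancelRace.RunAsync(100_000)` from a console app on Linux and checking that both counters are non-zero would at least confirm that both outcomes occur. Whether the loop ever hits the fatal error is an open question.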

I have plenty of things on my plate for the next week or two. After that, I should have some time to look into this further and provide additional information and do some debugging.

cc @karelz @antonfirsov @stephentoub

dotnet-policy-service bot added the untriaged label (New issue has not been triaged by the area owner) May 1, 2025

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@wfurt
Member

wfurt commented May 1, 2025

We should look into it. I don't see a reason why UDS would be any different, but it sounds suspicious.

wfurt removed the untriaged label (New issue has not been triaged by the area owner) May 1, 2025
wfurt added this to the 10.0.0 milestone May 1, 2025