@reynir
Contributor

@reynir reynir commented Dec 17, 2020

Call accept again if ECONNABORTED.
Fixes #829

@reynir
Contributor Author

reynir commented Dec 17, 2020

I am able to reproduce this bug somewhat consistently (it happens more than half the time) by running nmap -p 3000 from my laptop against the FreeBSD server on the internet. With opium 0.19.0, which uses Lwt_io.establish_server_with_client_socket, and this patch applied, I was not able to reproduce it in 1000 attempts.

@reynir
Contributor Author

reynir commented Dec 17, 2020

I saw the exception was raised from Lwt_unix.retry_syscall, and it handles Unix_error (EAGAIN|EWOULDBLOCK|EINTR, _, _). If ECONNABORTED only arises in accept then I'm tempted to think that should handle it as well. It's unclear to me if that's the case.
https://github.com/ocsigen/lwt/blob/master/src/unix/lwt_unix.cppo.ml#L496

@hannesm
Contributor

hannesm commented Dec 17, 2020

I was not aware that ECONNABORTED is not handled by Lwt_unix. I'd be in favor of handling it at the Lwt_unix level with a guard in accept_and_set_nonblock:

# git diff -w
diff --git a/src/unix/lwt_unix.cppo.ml b/src/unix/lwt_unix.cppo.ml
index cf0643ec5..0a519ea3c 100644
--- a/src/unix/lwt_unix.cppo.ml
+++ b/src/unix/lwt_unix.cppo.ml
@@ -1686,12 +1686,15 @@ external accept4 :
     Unix.file_descr * Unix.sockaddr = "lwt_unix_accept4"
 
 let accept_and_set_nonblock ch_fd =
+  try
     if Lwt_config._HAVE_ACCEPT4 then
       let (fd, addr) = accept4 ~close_on_exec:false ~nonblock:true ch_fd in
       (mk_ch ~blocking:false ~set_flags:false fd, addr)
     else
       let (fd, addr) = Unix.accept ch_fd in
       (mk_ch ~blocking:false fd, addr)
+  with
+  | Unix.Unix_error (Unix.ECONNABORTED, _, _) -> raise Retry
 
 let accept ch =
   wrap_syscall Read ch (fun _ -> accept_and_set_nonblock ch.fd)

At least, I have only ever been bothered by ECONNABORTED, and I have never needed to do anything other than call accept again for the next client connection.

EDIT: the above diff changes the (error) semantics of accept, so the documentation string would need to be updated. I'm curious whether there are any users of Lwt_unix.accept that want to handle ECONNABORTED in a different way.

@raphael-proust
Collaborator

I haven't written as many servers as either of you, so I won't be much help about how this unix error is generally handled.

Note that Lwt_unix generally tries to keep to OCaml's Unix semantics. The documentation of accept is just "Wrapper for [Unix.accept]", although it already handles EAGAIN, EWOULDBLOCK, and EINTR (via wrap_syscall). Because of this, I'm inclined to say that it is ok to handle this one additional error.

  • I'll leave this MR open for a little bit (and the error and other MR as well) in case someone has specific opinions against it.
  • I'd like the docstrings to be updated as part of this change and an entry added to the changelog. I'll do it before it is merged.

@aantron
Collaborator

aantron commented Dec 28, 2020

> it already handles EAGAIN, EWOULDBLOCK, and EINTR (via wrap_syscall).

Lwt does this only because it is necessary for how Lwt does non-blocking I/O (which is, at first, just try to do the I/O immediately, and let the system tell Lwt if I/O failed and Lwt needs to queue it for later). EINTR is arguable here.
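The "try first, wait only if the system says so" pattern described above can be sketched with the plain Unix module. This is only an illustration, not Lwt's implementation: it blocks in Unix.select where Lwt would register the operation with its event loop, and the names wait_readable and read_optimistically are hypothetical.

```ocaml
(* Requires the unix library that ships with OCaml. *)

(* Wait until [fd] is readable, restarting the wait if a signal interrupts it.
   A negative timeout tells Unix.select to wait indefinitely. *)
let rec wait_readable fd =
  match Unix.select [ fd ] [] [] (-1.0) with
  | _ -> ()
  | exception Unix.Unix_error (Unix.EINTR, _, _) -> wait_readable fd

(* Attempt the read immediately. Only when the kernel reports that the call
   would block (or was interrupted) do we wait and try again. Any other
   Unix_error is deliberately left to propagate to the caller. *)
let rec read_optimistically fd buf =
  match Unix.read fd buf 0 (Bytes.length buf) with
  | n -> n
  | exception
      Unix.Unix_error ((Unix.EAGAIN | Unix.EWOULDBLOCK | Unix.EINTR), _, _) ->
    wait_readable fd;
    read_optimistically fd buf
```

Only the error codes that mean "not ready yet" are absorbed here; everything else reaches the caller unchanged.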

AFAIK ECONNABORTED is not related to non-blocking I/O. I can't immediately see any reason why it should be handled internally in Lwt like EAGAIN, EWOULDBLOCK, and EINTR are. So it should definitely not be handled in the same place as these error codes.

> I saw the exception was raised from Lwt_unix.retry_syscall, and it handles Unix_error (EAGAIN|EWOULDBLOCK|EINTR, _, _). If ECONNABORTED only arises in accept then I'm tempted to think that should handle it as well.

So that would be the wrong place to handle it, if anywhere.

Separately, establish_server_* are high-level functions that may benefit from handling this exception. However, this only makes sense if there is no user that would like to know about these exceptions. I agree with @hannesm's query:

> I'm curious whether there are any users of Lwt_unix.accept that want to handle ECONNABORTED in a different way.

and

> I was not aware that ECONNABORTED is not handled by Lwt_unix.

Why should it be handled by Lwt_unix? Lwt_unix is supposed to make system calls appear to be non-blocking, and do nothing else. I don't think it should swallow any errors unless that is necessary for non-blocking operation, like EAGAIN and the others. Again, if this should be handled anywhere, it is in establish_server in Lwt_io, and even that seems dubious. The strongest argument for handling it in establish_server may be that the user of establish_server has no way to handle it themselves (I haven't looked at it again in enough detail to say).

@aantron
Collaborator

aantron commented Dec 28, 2020

For comparison, you can call accept(2) in C and have it fail with ECONNABORTED, so a robust server in C needs to be ready for that. The same should be true of a server written with Lwt_unix, though perhaps not of a server written using Lwt_io.establish_server_*.
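For a server written directly against Lwt_unix, being "ready for that" might look roughly like the sketch below. It is not the code merged in this PR. The names accept_skipping_aborted, on_abort and serve are hypothetical, and the accept function is passed in as a parameter so that the retry logic can be exercised without a real socket.

```ocaml
(* Requires the lwt and lwt.unix libraries. *)

(* Call [accept listening_fd] until it succeeds, skipping connections that
   the peer aborted before we got to them. [on_abort] is a hypothetical hook
   (e.g. for logging or counting) and defaults to doing nothing. Any other
   exception is passed on to the caller unchanged. *)
let accept_skipping_aborted ?(on_abort = ignore) accept listening_fd =
  let rec loop () =
    Lwt.catch
      (fun () -> accept listening_fd)
      (function
        | Unix.Unix_error (Unix.ECONNABORTED, _, _) ->
          on_abort ();
          loop ()
        | exn -> Lwt.fail exn)
  in
  loop ()

(* A minimal accept loop built on top of it: each accepted client is handed
   to [handle] in the background, and the loop goes back to accepting. *)
let rec serve listening_fd handle =
  let open Lwt.Infix in
  accept_skipping_aborted (fun fd -> Lwt_unix.accept fd) listening_fd
  >>= fun (client_fd, _addr) ->
  Lwt.async (fun () -> handle client_fd);
  serve listening_fd handle
```

Because the retry lives in user code, a caller who does want to see ECONNABORTED can simply call Lwt_unix.accept directly, which is the distinction being drawn between Lwt_unix and Lwt_io.establish_server_*.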

@hannesm
Contributor

hannesm commented Jan 4, 2021

@aantron I understand your argument, and agree that only Lwt_io.establish_server_* should handle ECONNABORTED.

@reynir
Contributor Author

reynir commented Jan 4, 2021

I appreciate the goal to have Lwt_unix be a non-blocking wrapper around Unix.

AFAICT it's not possible to handle ECONNABORTED as a user of establish_server_* other than by calling it again. I think it would make sense for it to be handled in establish_server_*.

@raphael-proust
Collaborator

Hey @reynir , sorry for the stall.

I rebased on master (I had to force push, so don't hesitate to force push any local changes you may have). I also added an entry in CHANGES.

AFAICT, the PR is now ready. I'll give it one more read to make sure and to leave some time for others to make comments too.

@hannesm
Contributor

hannesm commented May 26, 2021

ping - could this be merged and released? thanks! :)

@raphael-proust
Collaborator

I think this can be included in the upcoming bugfix release.

@raphael-proust raphael-proust merged commit 650d64e into ocsigen:master May 27, 2021
@raphael-proust raphael-proust added this to the 5.4.1 milestone May 28, 2021
@reynir reynir deleted the econnaborted branch February 20, 2023 12:30
Development

Successfully merging this pull request may close these issues.

Lwt_io.establish_server_generic doesn't handle ECONNABORTED