Eliminate all SOCKET_TO_FD usage in wolfcrypt, more error checking around invalid sockets #290
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
OK between this and the poll fixes in gnulib I am unable to recreate any deadlocks or faults even with a dozen+ transfers with concurrency.
The big thing here is removing all the calls to the SOCKET_TO_FD macro. While FD_TO_SOCKET is totally safe, the inverse is not. Every time it is called a new fd is created. Note this was modeled similarly to:
wget2/libwget/ssl_openssl.c
Line 1785 in 14d0460
To avoid using that we need to be keeping a reference to the FD and the socket (well technically only the FD but might as well track both as right now we track the socket instead anyway).
Doing this the cleanest way I figured was to extend the wolfssl session object to a new wolf_extended_session struct that stores the FD/socket references as well. Keeping a lookup table or storing it in the global session info (and adding another accessor method to get it off the session) seemed worse.
Note there is a good bit of commented code in ssl_wolfssl.c and an existing (seemingly unused)
struct session_context. I haven't touched most of this, and I am not sure if it is TODO code or more old code that is no longer needed, but adding my new extended session struct may make things more confusing.Briefly looking at the openssl code I think we would need to do similar there.
The error message in wget_ready_2_transfer I am not sure the best way to do so. Honestly the ability to call poll with an infinite timeout and no valid FD seems stupid to support but is technically a valid (or at least allowed) use case in *nix that hangs indefinitely. The definition says that any negative FD's are ignored, the idea being you invert the FD to avoid having to remove it from the set to just not watch it, although -1 isn't really a valid FD or inverted FD but would meet that condition. In addition it does explicitly call out that calling poll with no FD but a timeout just makes it a glorified sleep so that condition one wants to preserve. All of this is me saying that while calling with a invalid FD is legal, we might as well throw an error on our side. It is more a developer message than anything, and we are returning an error code but it could save someone some headaches. Doing it as a debug message seemed decent (it would have caught thing like some of the earlier wolfssl bugs fixed).
Finally, the other half of the deadlock resolution are the poll fixes. Here is my planned patch for upstream:
mitchcapper/gnulib@master...ours_win32_poll_fixes
we already have the poll_retry code in, my suggestion would be to add the poll function in its entirety under a different name (tmp_poll? updated_poll?) and change the poll_retry to use it. Once gnulib changes then we could just remove both and use the upstream function. I can do a PR for this if it sounds like a good idea.