feat(tls): implement non-blocking TrySend with async flush offload #516

glevkovich · 2025-12-14T12:43:44Z

Implemented TlsSocket::TrySend using a non-blocking state machine loop to handle upstream backpressure and TLS state requirements (NEED_READ/NEED_WRITE) without unnecessary context switching.

Key changes include:

Async Flush Offload: Introduced a "Fire and Hold" mechanism. If TrySend successfully consumes all user data but fails to fully flush the pending ciphertext to the network (partial write), the remaining flush is offloaded to a detached background AsyncReq. This prevents stalling the caller while ensuring data safety.
Async Logic Update: Updated AsyncRoleBasedAction to correctly handle detached flush requests (where vec == nullptr), allowing the background fiber to exit gracefully once the buffer is drained.
Small Buffer Optimization (SBO): Applied SBO using absl::InlinedVector for iovec copying to minimize heap allocations for standard batch sizes.
Build System: Added iovec_utils.cc to the tls_lib target in CMakeLists.txt.
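
For readers unfamiliar with the SBO item above, here is a minimal sketch of the idea without Abseil. `SmallIovecBuf` and its inline capacity of 16 are assumptions made for illustration; the PR itself uses absl::InlinedVector and its actual capacity may differ.

```cpp
#include <sys/uio.h>

#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for absl::InlinedVector<iovec, N>: batches of up to
// kInline entries are copied into the object itself; larger ones go to the heap.
class SmallIovecBuf {
 public:
  static constexpr std::size_t kInline = 16;  // assumed inline capacity

  SmallIovecBuf(const iovec* v, std::size_t len) : size_(len) {
    if (len > kInline) {
      heap_.assign(v, v + len);  // heap storage only for large batches
    } else {
      for (std::size_t i = 0; i < len; ++i) inline_[i] = v[i];
    }
  }

  bool on_heap() const { return size_ > kInline; }
  iovec* data() { return on_heap() ? heap_.data() : inline_.data(); }
  std::size_t size() const { return size_; }

 private:
  std::size_t size_;
  std::array<iovec, kInline> inline_{};
  std::vector<iovec> heap_;
};
```

The copy is needed because the send loop trims entries as bytes are consumed, and the caller's array should stay untouched.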

Testing improvements:

Scatter-Gather Tests: Added TrySendVectorTest to validate behavior with various iovec counts and split patterns.
White-Box Testing: Introduced MockTlsSocketTest infrastructure with Strict/Nice mocks for the Proactor, FiberSocket, and TlsEngine.
Edge Case Coverage: Added TrySendErrorTest to verify handling of EAGAIN, Dirty Shutdowns, and Concurrency Conflicts.
Async Verification: Added TrySendAsyncFlushTest to verify that stranded data is correctly offloaded to the async path.
Bug Fix: Fixed RegisterOnRecv test to explicitly call ResetOnRecvHook() before manual TryRecv, resolving a "Concurrent TryRecv and Recv" usage error.

Copilot

Pull request overview

This PR implements non-blocking TrySend functionality for TLS sockets with support for scatter-gather I/O and comprehensive error handling. The implementation uses a state machine loop to handle TLS engine requirements (NEED_WRITE, NEED_READ) without fiber context switching, applies Small Buffer Optimization (SBO) to minimize heap allocations for common iovec counts, and includes extensive test coverage for both happy paths and error scenarios.

Key changes:

Implemented TlsSocket::TrySend with non-blocking state machine handling for TLS engine opcodes
Added AdvanceIovec helper for correctly tracking partial consumption in scatter-gather arrays
Applied SBO pattern (stack buffer for ≤16 iovecs) to reduce allocations
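
As a rough illustration of what tracking partial consumption in a scatter-gather array involves, here is a sketch. The signature below is hypothetical and is not taken from the PR; the real AdvanceIovec may take different parameters.

```cpp
#include <sys/uio.h>

#include <cstddef>

// Hypothetical signature, for illustration only.
// Skips `consumed` bytes from the front of v[0..len). Returns the number of
// leading entries that are fully consumed (zero-length entries count as
// consumed). If the result is < len, v[result] is trimmed to its unsent tail.
std::size_t AdvanceIovec(iovec* v, std::size_t len, std::size_t consumed) {
  std::size_t i = 0;
  while (i < len && consumed >= v[i].iov_len) {
    consumed -= v[i].iov_len;
    ++i;
  }
  if (i < len && consumed > 0) {
    v[i].iov_base = static_cast<char*>(v[i].iov_base) + consumed;
    v[i].iov_len -= consumed;
  }
  return i;
}
```

With two entries of 4 and 8 bytes, consuming 6 bytes skips the first entry and leaves the second pointing 2 bytes in with 6 bytes remaining.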

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

File	Description
util/tls/tls_socket.h	Added `AdvanceIovec` helper method declaration and improved its documentation
util/tls/tls_socket.cc	Implemented `TrySend` methods with state machine loop, `AdvanceIovec` helper, and SBO optimization; includes flush-encrypt loop with error handling
util/tls/tls_socket_test.cc	Added parameterized scatter-gather tests (`TrySendVectorTest`) for various iovec counts; added comprehensive mock-based error scenario tests (`TrySendErrorTest`) covering concurrency guards, flush blockages, renegotiation, and fatal errors; updated `RegisterOnRecv` test to use new `TrySend`; improved parameter parsing for uring detection

codecov-commenter · 2025-12-14T12:52:59Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 87.34177% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.43%. Comparing base (dd0b9d2) to head (df7ced4).

Files with missing lines	Patch %	Lines
util/tls/tls_socket_test.cc	91.00%	18 Missing ⚠️
util/tls/iovec_utils.cc	62.06%	11 Missing ⚠️
util/tls/tls_socket.cc	87.35%	11 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #516      +/-   ##
==========================================
+ Coverage   78.12%   78.43%   +0.30%     
==========================================
  Files         116      117       +1     
  Lines       10319    10629     +310     
==========================================
+ Hits         8062     8337     +275     
- Misses       2257     2292      +35


romange · 2025-12-15T11:51:17Z

util/tls/tls_socket.cc

+  bool has_data{false};
+  for (size_t i{}; i < len; ++i) {
+    if (v[i].iov_len > 0) {
+      has_data = true;
+      break;
+    }
+  }
+  if (!has_data)
+    return 0;  // nothing to send


This can be factored out into a helper function, and let's DCHECK as well, as I do not see why we should allow callers to call this function with no data.

I'll put a helper, thanks - see my other comment about iovec_utils.cc/h. I'm going to add part of it there.

I'll add a DCHECK since I can assume we never want to send 0 data in debug.

Production: Standard POSIX behaviour (specifically writev and sendmsg) dictates that if you pass a valid iovec array where the sum of lengths is zero, the system call simply returns 0 and does nothing. It is not an error, and we cannot crash on it.
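
A possible shape for the factored-out check is sketched below. `HasData` is a hypothetical name and is not necessarily what ended up in iovec_utils. The caller would DCHECK on its result in debug builds and return 0 in production when it is false.

```cpp
#include <sys/uio.h>

#include <algorithm>
#include <cstddef>

// Hypothetical helper: true if at least one entry carries payload bytes.
inline bool HasData(const iovec* v, std::size_t len) {
  return std::any_of(v, v + len,
                     [](const iovec& e) { return e.iov_len > 0; });
}
```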

util/tls/tls_socket.cc

romange · 2025-12-15T11:53:39Z

util/tls/tls_socket.cc

+    DVSOCK(3) << "TrySend blocked: WRITE_IN_PROGRESS detected";
+    return make_unexpected(make_error_code(errc::resource_unavailable_try_again));
+  }
+  bool read_in_progress{(state_ & READ_IN_PROGRESS) != 0};


should we DCHECK here that engine_->OutputPending() == 0 ?

I don't think so:

TrySend does not control the SSL engine exclusively. I have no idea who called the engine before, possibly via another function. Maybe some read on the socket generated SSL metadata on the output, among other cases.

TrySend can run concurrently with some Read operation, so again OutputPending() can become non-zero.

romange · 2025-12-15T11:56:11Z

util/tls/tls_socket.cc

+
+  while ((curr_iovec_len > 0) || (engine_->OutputPending() > 0)) {
+    // 1. Flush into the upstream socket any pending output from the engine output buffer before
+    // pushing more data to the engine from the user. These might be bytes from previous call.


I am not sure it's the best approach.

batching writes is beneficial - we do not want to send a packet per few bytes, we already had such bugs.

if engine_->OutputPending() > 0 that should mean that there is an asynchronous process that takes care of it, imho.

Regarding Batching: You are right that the current "Flush-Push-Flush" loop could result in small packets (one per iovec), which isn't ideal for performance. I will try to refactor this to make multiple PushToEngine calls before triggering a single flush to the upstream socket. If this makes the non-blocking logic too complex for this single PR, I will merge this correct version first and optimize it in a dedicated follow-up PR to keep the changes manageable.

Regarding OutputPending & Async Flushing: I have to disagree on the "asynchronous process" point. At the very top of TrySend, I check:
if ((state_ & WRITE_IN_PROGRESS) != 0) return ... try_again;
If there were any asynchronous process or background fiber currently flushing this buffer, WRITE_IN_PROGRESS would be set, and I would have exited immediately.
Since I reached this line, I am the exclusive writer. There is no other active process responsible for this data. If I don't flush OutputPending here, the data will sit in the SSL BIO indefinitely (causing latency) until the next API call happens to trigger a write. Therefore, it is my responsibility to flush it. Also, this serves as the single place to flush between iterations (only when needed, in the optimized version that is still to be written).
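
The exclusivity argument can be sketched with a toy model. `ToySocket` and `TrySendPrologue` are hypothetical names, and the counters stand in for the real engine and upstream socket. The flag check comes first, so any leftover output seen afterwards belongs to this caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <system_error>

// Toy model, not the real TlsSocket: a state bitmask plus byte counters.
struct ToySocket {
  enum : std::uint8_t { WRITE_IN_PROGRESS = 1, READ_IN_PROGRESS = 2 };
  std::uint8_t state = 0;
  std::size_t output_pending = 0;  // stands in for engine_->OutputPending()
  std::size_t flushed = 0;         // bytes handed to the upstream socket

  // Fails with "try again" if another writer owns the engine output.
  // Otherwise this caller is the exclusive writer and flushes leftovers first.
  std::error_code TrySendPrologue() {
    if ((state & WRITE_IN_PROGRESS) != 0)
      return std::make_error_code(std::errc::resource_unavailable_try_again);
    flushed += output_pending;  // assume the upstream socket accepts it all
    output_pending = 0;
    return {};
  }
};
```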

romange · 2025-12-17T13:54:26Z

util/tls/tls_socket.cc

+        DVSOCK(3) << "Flushed " << *send_result << " bytes to upstream";
+        if ((*send_result) < output_buf.size()) {  // case 1.A: partial write
+          // upstream socket is full - try again later
+          returned_status = make_error_code(errc::resource_unavailable_try_again);


Bug. what ensures that the write will be flushed when next_sock_ becomes available?

make_error_code(errc::resource_unavailable_try_again); is wrong as you already consumed some of the input data and it was copied to ssl engine. From a caller perspective - they need to retry the entire operation, so they will try to push the same data again

I believe that for partial writes the code is correct. There is only one edge case, which I will mention at the end and which is a bit "tricky" (a full write with a partial flush of the SSL engine output).

====================

"what will push the data eventually(1)" ?
Since I return total_bytes_sent > 0, the caller must treat this as a successful partial write. Standard non-blocking socket behaviour implies the caller will eventually call TrySend (or TryRecv) again. It must call TrySend again, since a partial write also implies "try again after advancing your iovec".

When the user does call TrySend again (even with new data), the very first block of this function:
if (engine_->OutputPending() > 0) { ... }
ensures that we attempt to flush the pending ciphertext before accepting any new user data.
Regarding the error handling (swallowing EAGAIN/errors): this mimics standard POSIX write() semantics. If a write partially succeeds but then hits a network error or blocks, we report the success first (we return total_bytes_sent > 0). The error remains pending and will be returned on the next call (when total_bytes_sent is 0). If the error is fatal, it will still be there on the next user call; if it is non-fatal, the next TrySend/TryRecv might not encounter it anymore. There is one edge case to discuss at the end of this comment.

=========================
Regarding the duplication concern (2):
I believe the duplication concern does not exist, and is resolved by the return logic at the end of the function.
You are correct that returned_status is set to try_again, but notice the check at the very end:

if (total_bytes_sent > 0) { return total_bytes_sent; }

Even if returned_status contains an error (like EAGAIN or a socket error), if I have successfully pushed any bytes to the engine (total_bytes_sent > 0), I return that positive count, effectively masking the error for this specific call. This ensures that:

The caller sees a positive return value (partial write).

The caller advances their buffer pointers by total_bytes_sent.

The caller invokes TrySend again with the remaining (new) data.

Therefore, no data is duplicated. The user never retries with the same data because they received a positive confirmation for the chunk that was processed.
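
The caller-side contract described in these steps can be sketched as follows. `FakeSender` is a hypothetical stand-in that accepts a bounded number of bytes per call; it is not the real TrySend. The point is that the caller only ever resubmits the unsent tail.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for TrySend: accepts at most `budget` bytes per call
// and returns the count accepted (0 models "would block").
struct FakeSender {
  std::size_t budget;
  std::string wire;  // everything accepted so far, in order

  std::size_t TrySend(std::string_view data) {
    const std::size_t n = std::min(budget, data.size());
    wire.append(data.substr(0, n));
    return n;
  }
};

// Caller side: advance by the returned count, never resend accepted bytes.
// Returns the number of TrySend calls made.
inline std::size_t SendAll(FakeSender& s, std::string_view data) {
  std::size_t calls = 0;
  while (!data.empty()) {
    const std::size_t n = s.TrySend(data);
    ++calls;
    if (n == 0) break;  // a real caller would wait for writability here
    data.remove_prefix(n);
  }
  return calls;
}
```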

====================
Regarding the "Full Write, Partial Flush" scenario (this one is tricky):

Consider the case where the user sends 1000 bytes. We successfully push all 1000 bytes into the SSL engine (encrypting them), but the upstream socket only accepts 500 bytes of the ciphertext before returning EAGAIN.
In this state:
I must return 1000 (success). All the user data has been consumed and encrypted (but some of it not sent yet on the upstream socket). If I return EAGAIN here, the user will retry sending the same 1000 bytes. This would encrypt the data a second time, corrupting the TLS stream (duplication).

So, what ensures the flush?

My claim is that TrySend is a "best effort" function with a minimalist return value that can't reflect every complex situation, and it must return the number of bytes sent even if there was an error. It is the caller's duty to make sure the TLS engine output buffer is flushed, either by calling TrySend/TryRecv again or by using AsyncReq.

In detail:

Immediate Retry: Since I return a positive byte count, the non-blocking contract implies the user (or upper layer) will continue to call TrySend (to send more data) or TryRecv (to wait for a response). Both functions begin by attempting to flush the engine buffer. But what if the user does not call again because they do not want to send or receive more data? For that we have the second mechanism.

Async Layer Safety: For the async/fiber implementation (AsyncReq), we explicitly handle this state. In AsyncReq::MaybeSendOutputAsyncWithRead (and AsyncReadSome), we check engine_->OutputPending() and call StartUpstreamWrite() if it is positive. That call registers a write on the upstream socket via next_socket_->AsyncWriteSome. So if there is buffered ciphertext, we register for a write event (EPOLLOUT) even if the user requested a read, preventing any deadlock.

We concluded that the function is correct for all cases except one: when all user data has been sent and there is pending data in the engine output buffer. In that case we would like to make this function "send and forget": the code will start an async process to make sure the engine output buffer is flushed into the upstream socket.
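
The agreed return rule, including the "full write, partial flush" case, can be summarized in a small sketch. `SendOutcome` and `Decide` are hypothetical names used only here; in the PR this logic lives inside TrySend rather than in a separate function.

```cpp
#include <cstddef>
#include <system_error>

// Hypothetical summary of one TrySend call, used only for this sketch.
struct SendOutcome {
  std::size_t bytes = 0;       // user bytes reported as consumed
  std::error_code error;       // set only when bytes == 0
  bool offload_flush = false;  // hand the remaining flush to a background request
};

inline SendOutcome Decide(std::size_t total_bytes_sent,
                          std::size_t user_bytes_left,
                          std::size_t output_pending,
                          std::error_code status) {
  SendOutcome out;
  if (total_bytes_sent > 0) {
    out.bytes = total_bytes_sent;  // progress masks the error for this call
    // All user data consumed but ciphertext is stranded: offload the flush.
    out.offload_flush = (user_bytes_left == 0 && output_pending > 0);
  } else {
    out.error = status;  // no progress, so the caller sees the error
  }
  return out;
}
```

For the 1000-byte example, `Decide(1000, 0, 500, EAGAIN)` reports 1000 bytes, no error, and requests the offload.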

romange · 2025-12-17T13:55:05Z

util/tls/tls_socket.cc

+        }
+        // case 1.B: full write - fall through to the next step
+      } else {  // case 1.C: write failed (EAGAIN or other Error).
+        returned_status = send_result.error();


same thing - what will push the data eventually?

Regarding the general error case: This follows standard POSIX partial write semantics. If we successfully processed some bytes but then hit a hard error (e.g., Broken Pipe), we return total_bytes_sent first. The next call will attempt to flush, hit the error immediately (with total_bytes_sent == 0), and correctly return the error to the user. For the edge case of a full write with a partial flush, see the end of this comment: feat(tls): implement non-blocking TrySend with async flush offload #516 (comment)

We concluded that the function is correct for all cases except one: when all user data has been sent and there is pending data in the engine output buffer. In that case we would like to make this function "send and forget": the code will start an async process to make sure the engine output buffer is flushed into the upstream socket.
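
The deferred-error behaviour can be demonstrated with a toy model. `ToyTls`, its 16-byte engine capacity, and the always-failing upstream are assumptions for illustration and do not mirror the real class. The first call reports progress; the second surfaces the error.

```cpp
#include <algorithm>
#include <cstddef>
#include <system_error>

// Toy model, not the real TlsSocket: byte counters replace real buffers.
struct ToyTls {
  std::size_t engine_capacity = 16;  // assumed size of the engine output buffer
  std::size_t pending = 0;           // ciphertext not yet flushed upstream
  bool upstream_broken = false;      // when true, every flush fails with EPIPE

  std::error_code Flush() {
    if (pending == 0) return {};
    if (upstream_broken) return std::make_error_code(std::errc::broken_pipe);
    pending = 0;
    return {};
  }

  // Returns the number of user bytes consumed. `ec` is set only when that
  // number is zero, so progress is always reported before an error.
  std::size_t TrySend(std::size_t len, std::error_code& ec) {
    ec.clear();
    std::size_t total = 0;
    std::error_code status = Flush();  // leftovers from earlier calls first
    if (!status) {
      total = std::min(len, engine_capacity - pending);
      pending += total;
      status = Flush();  // may fail after the data was already consumed
    }
    if (total == 0) ec = status;
    return total;
  }
};
```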

- Implemented TlsSocket::TrySend using a non-blocking state machine loop to handle NEED_WRITE and NEED_READ without context switching.
- Added AdvanceIovec helper to correctly track partial consumption of scatter-gather arrays.
- Applied Small Buffer Optimization (SBO) to iovec vectors, using stack storage for small batches to minimize heap allocations.
- Simplified the flush/encrypt loop structure for improved readability and reduced code size.
- Updated tls_socket_test with parameterized scatter-gather tests and mock-based error scenarios.

Signed-off-by: Gil Levkovich <[email protected]>

Signed-off-by: Gil Levkovich <[email protected]>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Signed-off-by: Gil Levkovich <[email protected]>

glevkovich requested a review from Copilot December 14, 2025 12:43

Copilot started reviewing on behalf of glevkovich December 14, 2025 12:44

Copilot AI reviewed Dec 14, 2025

glevkovich force-pushed the glevkovich/improve_io_flow_trysend_impl branch 3 times, most recently from b034858 to 7771310 Compare December 14, 2025 13:29

glevkovich requested a review from romange December 14, 2025 13:30

glevkovich force-pushed the glevkovich/improve_io_flow_trysend_impl branch from 7771310 to f3dad3a Compare December 14, 2025 13:52

glevkovich marked this pull request as ready for review December 14, 2025 14:44

romange reviewed Dec 15, 2025

romange reviewed Dec 17, 2025

romange requested changes Dec 17, 2025

glevkovich added 4 commits December 17, 2025 16:53

wp pass

568fb28

fix Pr comments

b16157d

Signed-off-by: Gil Levkovich <[email protected]>

fix pr comments

ede2ab2

Signed-off-by: Gil Levkovich <[email protected]>

glevkovich force-pushed the glevkovich/improve_io_flow_trysend_impl branch from d05c05a to ede2ab2 Compare December 17, 2025 17:33

fix pr comments

df9b9df

Signed-off-by: Gil Levkovich <[email protected]>

glevkovich requested a review from Copilot December 18, 2025 11:25

Copilot started reviewing on behalf of glevkovich December 18, 2025 11:26

Copilot AI reviewed Dec 18, 2025

glevkovich added 2 commits December 18, 2025 14:25

fix more PR comments and typos

c87f8e7

Signed-off-by: Gil Levkovich <[email protected]>

fix bug in TlsSocketTest RegisterOnRecv test case

df7ced4

Signed-off-by: Gil Levkovich <[email protected]>

glevkovich changed the title ~~feat(tls): implement non-blocking TrySend with SBO and state handling~~ feat(tls): implement non-blocking TrySend with async flush offload Dec 18, 2025

4 participants
