Handle internal overflow of the write buffer in WriteAsync by introducing new cases #2802
1) Internal buffer overflows, but write can fit in next buffer - Fill the existing buffer, copy the remaining data into a new buffer, and flush the original to disk
2) Internal buffer is non-empty, overflows, and the remaining data won't fit in a second buffer - Chain the flush operation to a second write operation which writes the entire incoming block directly to disk.
Additionally, adds a new unit test to ensure the different buffering cases are being covered.
Hi @tymlipari, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution! TTYL, DNFBOT;
@tymlipari, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.
Why are we not passing through the cancellationToken param?
Whoops, likely a copy-paste error. Fixed in updated commit.
Chaining like this is the right idea. However, using ContinueWith as is done is going to be problematic. ContinueWith executes the continuation regardless of whether the antecedent task completes successfully, due to cancellation, or due to faulting with an exception, and we'd only want to do the subsequent write if the flush completed successfully. If it didn't, we want to return the status from the FlushWriteAsync task instead of that of executing the write. The semantics of await are more appropriate here. As such, you probably want a helper async method that awaits the flush and then awaits the write, and that helper can be invoked here.
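For illustration, here is a minimal sketch of the helper shape suggested above, next to the ContinueWith shape it would replace. The names ChainedWriteSketch, FlushThenWriteAsync, and FlushThenWriteWithContinueWith are hypothetical, and the two delegates are stand-ins for the flush and the write; none of this is the actual FileStream code.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch; flushAsync and writeAsync stand in for the real operations.
static class ChainedWriteSketch
{
    // await semantics: if the flush faults or is canceled, the await rethrows,
    // the write is never started, and the returned task reports the flush's status.
    public static async Task FlushThenWriteAsync(Func<Task> flushAsync, Func<Task> writeAsync)
    {
        await flushAsync().ConfigureAwait(false);
        await writeAsync().ConfigureAwait(false);
    }

    // ContinueWith with default options: the continuation runs whether the
    // antecedent succeeded, faulted, or was canceled, so the write still happens
    // and the returned task reflects only the write.
    public static Task FlushThenWriteWithContinueWith(Func<Task> flushAsync, Func<Task> writeAsync)
    {
        return flushAsync().ContinueWith(_ => writeAsync()).Unwrap();
    }
}
```

With a flush that fails, the first helper returns a faulted task and never invokes the write delegate, while the second invokes it anyway and hides the flush failure from the caller.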
I thought more about the whole buffering issue. Seems like we have a few goals here:
We also need to maintain existing behaviors as much as possible. For example, in going over the code, it’s clear that it goes out of its way to maintain the current Position, so that multiple asynchronous writes can be issued one after the other to execute concurrently, such that the resulting sequence will end up being correct. (In other words, we don't need to worry about multiple threads concurrently accessing the FileStream, but we do need to worry about the same thread issuing multiple async operations to potentially run concurrently.) However, the common case is non-overlapped calls to the public XxAsync methods, where each call is awaited before the next one is issued.
So, here’s my thinking on what we do:
The only really interesting case here is the last one, where there’s data in the buffer and not enough remaining space for the data being written. As I see it, there are only three possible options for how to handle this:
I’ve suggested doing (3) above because it only negatively impacts the case where flushes fail (which is rare), we can mitigate it for the case where the flush fails synchronously (in which case we can detect that and can avoid starting the write), and in the case of failure the data in the file probably can’t be considered coherent anyway. @ericstj, @tymlipari, @ianhays, thoughts?
Delayed synchronous operations at that. The initial WriteAsync with data <= bufferLength when the buffer is empty will complete synchronously after the memcpy into the buffer, yes? But then when a subsequent call comes in that requires a flush, the actual writing of that first WriteAsync is performed synchronously. That an asynchronous write uses the buffer at all seems foreign to me, but to use it synchronously feels downright incorrect.
Why is this the case here but not in (3)? Wouldn't Position be set to the location of the successful flush + length of the write?
To clarify, would the scenario you mention look something like this: buffer size 10 bytes
Correct. Which I think is fine. The buffer is an optimization that alleviates the need to make sys calls and hit the disk on every write operation.
Correct. That's the problem. Flushing of the buffer is done by writing it out synchronously rather than asynchronously.
It's the difference between (pseudo-code):
and
By the time the first block completes, Position has been updated to reflect the position after the second write. But by the time the second block completes, flushTask may not have completed, and thus we may not have actually called WriteInternalAsync the second time yet, so Position may not be correct.
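As a rough sketch of the two shapes being contrasted (an assumed reconstruction, not the original pseudo-code), the model below uses a hypothetical PositionModel class whose WriteInternalAsync stand-in advances Position as soon as a write is issued. The method names IssueBoth and ChainBehindFlush are made up for the example.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical model: Position advances when a write is issued, not when it completes.
sealed class PositionModel
{
    public long Position { get; private set; }

    // Stand-in for WriteInternalAsync; ioCompletion simulates the pending I/O.
    public Task WriteInternalAsync(int count, Task ioCompletion)
    {
        Position += count;
        return ioCompletion;
    }

    // Shape 1: both writes are issued before returning, so Position is already
    // final when the caller gets the task back.
    public Task IssueBoth(int buffered, int incoming, Task flushIo, Task writeIo)
    {
        Task flushTask = WriteInternalAsync(buffered, flushIo);
        Task writeTask = WriteInternalAsync(incoming, writeIo);
        return Task.WhenAll(flushTask, writeTask);
    }

    // Shape 2: the second write is only issued once the flush completes, so
    // Position can lag behind when the method returns to the caller.
    public async Task ChainBehindFlush(int buffered, int incoming, Task flushIo, Task writeIo)
    {
        await WriteInternalAsync(buffered, flushIo).ConfigureAwait(false);
        await WriteInternalAsync(incoming, writeIo).ConfigureAwait(false);
    }
}
```

With a flush that is still pending, IssueBoth(8, 4, ...) leaves Position at 12 on return, while ChainBehindFlush(8, 4, ...) leaves it at 8 until the flush finishes.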
We'd issue two writes: one for the buffered 8 bytes and one for the caller's 4 bytes. If the one for the buffered 8 bytes fails synchronously (e.g. something goes wrong attempting to initiate the overlapped I/O), we don't try to do the second write, the position remains unchanged, no problem. If the one for the buffered 8 bytes succeeds, it's exactly as it is today, no problem. If the one for the buffered 8 bytes fails asynchronously, however, then we would already have issued the second 4 byte write, so the position would now be 12 bytes ahead of where we started rather than the 8 bytes ahead of where we'd started if we'd done the flushing synchronously.
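The three outcomes above can be walked through with a small model. This is only a sketch under assumptions: OverflowModel, Issue, and FlushAndWrite are hypothetical names, and a synchronous failure is simulated by throwing before anything is issued.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical model of "issue the flush, then immediately issue the write".
sealed class OverflowModel
{
    public long Position { get; private set; }
    public int WritesIssued { get; private set; }

    // Simulates initiating one I/O: either fails synchronously (nothing issued,
    // Position untouched) or advances Position and returns the pending completion.
    private Task Issue(int count, bool failSynchronously, Task completion)
    {
        if (failSynchronously) throw new IOException("could not initiate I/O");
        Position += count;
        WritesIssued++;
        return completion;
    }

    public Task FlushAndWrite(int buffered, int incoming, bool flushFailsSynchronously, Task flushCompletion)
    {
        // A synchronous failure throws here, before the second write is attempted.
        Task flushTask = Issue(buffered, flushFailsSynchronously, flushCompletion);
        Task writeTask = Issue(incoming, false, Task.CompletedTask);
        return Task.WhenAll(flushTask, writeTask);
    }
}
```

Calling FlushAndWrite(8, 4, ...) with a synchronous flush failure leaves Position at 0 with no writes issued. With a flush that faults asynchronously, both writes have been issued, Position is 12, and the returned task is faulted.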
I see, I was mixing up my sequence of events with (1). I'd argue that having a potentially incorrect Position is a big enough problem to seek a different solution - to not do so would make concurrent async IO operations far more difficult to be sure about. Between (1) and (3), I begrudgingly cast my vote for (3). The edge case in which it fails is narrow enough that I think the benefits gained from asynchronous writes with data.Length < bufferSize are worth the tradeoff.
@stephentoub I've pushed a new commit that I think handles your cases above. Let me know what you think.
Every time we call FlushWriteAsync, we do these four lines to update _activeBufferOperation. Should they just be moved into FlushWriteAsync? This is the only place where there's the potential for avoiding it, and it's an uncommon case (an async FileStream being finalized with data in the buffer) and a cheap enough operation that the overhead doesn't seem problematic even if it's unnecessary. Worst case, it could be moved into its own helper rather than repeating it each time.
This is just personal preference, so feel free to leave it as-is, but I prefer to write such things as:
_activeBufferOperation = ActiveBufferOperation() ?
    Task.WhenAll(writeTask, _activeBufferOperation) :
    writeTask;
That way it's clear that we're always assigning into the field, and the conditional just determines exactly what we're assigning.
It'd be good to have a test that tries to issue many writes/flushes to be run concurrently. The test you have and this should also verify that Position reflects the correct value immediately after the synchronous call to the XxAsync method returns. Note that the OS may actually choose to issue writes synchronously, and I believe it'll do so if the write would require extending the size of the file, so it'd be good to have the test verify behavior both with a zero-length file to start and with a file presized for all data needed (this could be done as a theory with a Boolean argument that controls whether you start by setting the length). I believe it'll also do writes synchronously for small writes, so it'd be good to also have a test that does large writes, like 100K at a time.
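A partial sketch of such a check, covering only the presized versus zero-length and large-write dimensions: it awaits each write before issuing the next, so it does not exercise the concurrent case. LargeWriteCheck and RunAsync are hypothetical names, and the 4096-byte buffer size is an arbitrary assumption. In an xUnit suite the Boolean would typically be supplied through [Theory] with [InlineData(true)] and [InlineData(false)].

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper; returns true when Position and file contents match expectations.
static class LargeWriteCheck
{
    public static async Task<bool> RunAsync(bool presize, int writeSize = 100 * 1024, int writeCount = 4)
    {
        string path = Path.GetTempFileName();
        try
        {
            byte[] expected = new byte[writeSize * writeCount];
            new Random(42).NextBytes(expected);

            using (var fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                           FileShare.None, 4096, useAsync: true))
            {
                // Presizing avoids extending the file on each write.
                if (presize) fs.SetLength(expected.Length);

                for (int i = 0; i < writeCount; i++)
                {
                    await fs.WriteAsync(expected, i * writeSize, writeSize);
                    // After an awaited write, Position must have advanced by exactly one chunk.
                    if (fs.Position != (long)(i + 1) * writeSize) return false;
                }
                await fs.FlushAsync();
            }

            // The file must contain exactly the bytes written, in order.
            return File.ReadAllBytes(path).SequenceEqual(expected);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Checking Position immediately after an un-awaited call returns, as requested above, would need the writes to be issued without awaiting and is left out of this sketch.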
This is looking good, I think. Nice job, @tymlipari. @ericstj, could you please weigh in on the approach and the change overall? @ianhays, thoughts as well?
Change <= _bufferSize to < _bufferSize
These two calls to ActiveBufferOperation() should now be to HasActiveBufferOperation.
Yup, looks like I was a bit too eager to push the change. I'm running a quick build and test locally and then I'll push the fix.
Thanks for doing this, @tymlipari. This is a worthwhile change to take, especially since useAsync==true is the default now for FileStream in .NET Core. At the moment, though, this area doesn't have enough test coverage for us to feel confident in merging it (I discovered at least one bug for which I already have a fix locally, specifically the case where multiple concurrent writes are issued while there's an outstanding flush). I want to write a bunch of additional tests for this and do some more inspection of it before we incorporate it. As such, I think the best thing to do at this point is for me to take this over. Once I'm ready, I'll submit a new PR that includes your commits plus additional ones with tests and any necessary code fixes; I'll cc you on that PR. We can leave this PR open as a placeholder until then. Sound ok to you? If you'd prefer an alternate plan, please do let me know.
Opened #2929 to replace this (squashing and keeping the commits from this PR). Thanks, @tymlipari!
Handle internal overflow of the write buffer in WriteAsync asynchronously by introducing new cases:
Introduces new cases to handle overflow of the internal write buffer rather than synchronously flushing the buffer. In the best case, it'll create a new buffer and write the original one to disk (single write). In the worst case, it will chain the incoming write to the original flush operation (double write). All writes are now done asynchronously. Additionally, adds a new unit test to ensure the different buffering cases are being covered.
Fix #1531