Handle internal overflow of the write buffer in WriteAsync by introdu… #2802

tymlipari · 2015-08-14T05:18:54Z

Handle internal overflow of the write buffer in WriteAsync asynchronously by introducing new cases:

Introduces new cases to handle overflow of the internal write buffer rather than synchronously flushing the buffer. In the best case, it'll create a new buffer and write the original one to disk (single write). In the worst case, it will chain the incoming write to the original flush operation (double write). All writes are now done asynchronously. Additionally, adds a new unit test to ensure the different buffering cases are being covered.

Fix #1531

…cing new cases:
1) Internal buffer overflows, but write can fit in next buffer - Fill the existing buffer, copy the remaining data into a new buffer, and flush the original to disk
2) Internal buffer is non-empty, overflows, and the remaining data won't fit in a second buffer - Chain the flush operation to a second write operation which writes the entire incoming block directly to disk.
Additionally, adds a new unit test to ensure the different buffering cases are being covered.

dnfclas · 2015-08-14T05:18:58Z

Hi @tymlipari, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

dnfclas · 2015-08-14T05:21:52Z

@tymlipari, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, DNFBOT;

TheRealPiotrP · 2015-08-14T07:35:15Z

src/System.IO.FileSystem/src/System/IO/Win32FileStream.cs

Why are we not passing through the cancellationToken param?

Whoops, likely a copy-paste error. Fixed in updated commit.

stephentoub · 2015-08-14T20:22:29Z

src/System.IO.FileSystem/src/System/IO/Win32FileStream.cs

Chaining like this is the right idea. However, using ContinueWith as is done is going to be problematic. ContinueWith executes the continuation regardless of whether the antecedent task completes successfully, due to cancellation, or due to faulting with an exception, and we'd only want to do the subsequent write if the flush completed successfully. If it didn't, we want to return the status from the FlushWriteAsync task instead of that of executing the write. The semantics of await are more appropriate here. As such, you probably want a helper async method that awaits the flush and then awaits the write, and that helper can be invoked here.
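A minimal sketch of the kind of helper suggested above, not the code from this PR. `FlushThenWriteAsync` and its two delegate parameters are hypothetical stand-ins for the stream's flush and write operations. The point is only that `await` stops at a faulted or canceled flush and surfaces that outcome, whereas a plain `ContinueWith` continuation would run regardless.

```csharp
using System;
using System.Threading.Tasks;

public static class FlushThenWrite
{
    // Hypothetical helper: flushAsync and writeAsync stand in for the
    // stream's internal flush and write operations.
    public static async Task FlushThenWriteAsync(Func<Task> flushAsync, Func<Task> writeAsync)
    {
        // If the flush faults or is canceled, this await rethrows, the write
        // is never started, and the returned task carries the flush's outcome.
        await flushAsync().ConfigureAwait(false);

        // Reached only when the flush completed successfully.
        await writeAsync().ConfigureAwait(false);
    }
}
```

With a flush that fails, the returned task ends up faulted with the flush's exception and the write delegate is never invoked.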

stephentoub · 2015-08-18T17:00:22Z

I thought more about the whole buffering issue.

Seems like we have a few goals here:
A. Minimize the blocking employed, e.g. not synchronously flushing during an asynchronous write
B. Minimize the allocation employed, e.g. not creating a new buffer every time we flush

We also need to maintain existing behaviors as much as possible. For example, in going over the code, it’s clear that it goes out of its way to maintain the current Position, so that multiple asynchronous writes can be issued one after the other to execute concurrently, such that the resulting sequence will end up being correct. (In other words, we don't need to worry about multiple threads concurrently accessing the FileStream, but we do need to worry about the same thread issuing multiple async operations to potentially run concurrently.)

However, the common case is non-overlapped calls to the public XxAsync methods, e.g. the common pattern is await fileStream.XxAsync(…), such that no additional operations are issued on the FileStream instance until the previous operation completes. That’s what we should be optimizing for, while still supporting the Task t1 = fileStream.XxAsync(); Task t2 = fileStream.XxAsync(); await Task.WhenAll(t1, t2); case.

So, here’s my thinking on what we do:

We keep track of a single Task field on the FileStream used to determine whether any async operation is still in progress that may involve the internal buffer. Any time an async flush operation is issued, this field gets updated. If the field is null or if the Task in the field has completed (doesn't matter if successfully or not), we simply store the created task into the field. If the Task in the field is not completed, we store “Task.WhenAll(_field, newTask)” into the field. At least for now, this serves purely to help with buffer management: if we know there’s no asynchronous flush in progress, then we don’t need to allocate a new internal buffer.
When we WriteAsync:

If the data fits in the remaining space in the internal buffer, and if the tracking task has completed, we simply store it into the buffer synchronously, as we do today. This is very efficient. No extra allocations, no extra async ops.
If the data fits in the remaining space in the internal buffer, but the tracking task has not yet completed, there’s a flush async operation still in progress, which means we can’t touch the internal buffer. So we allocate a new buffer to replace the previous one and store the data into the new buffer. This does involve an extra buffer allocation, but it’s the uncommon case of scheduling an async write when there's a previously issued async flush still in progress.
If the data doesn’t fit in the remaining space in the internal buffer and the buffer is empty, the data being written is larger than the whole internal buffer, and we simply do the asynchronous write, as we do today.
If the data doesn't fit in the remaining space in the internal buffer and the buffer has data in it, we issue the asynchronous flush to empty the buffer. Once we have that flush’s task (and we know the Position has been updated accordingly), assuming it didn’t synchronously fail, we issue the internal WriteAsync for the user provided data. These operations are now running concurrently, but doing it this way (rather than using a continuation off of the flush task) means that when returning from this write operation, the Position will have been updated accordingly. The only difficulty here is that we now have two tasks. We can return a Task.WhenAll that combines the two operations. We don't need to proactively allocate a new buffer in this case, because the write operation isn't touching the buffer: we simply pass the user's data directly through.
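The four WriteAsync cases above can be sketched as a small decision function. This is an illustration under assumptions, not the FileStream implementation: `WriteAction`, `WritePlanner`, `Plan` and its parameters are hypothetical names, and "fits" is taken to mean `count <= remaining`, which the actual change may define differently.

```csharp
using System;

// Hypothetical names for the four cases described in the comment above.
public enum WriteAction
{
    CopyIntoBuffer,             // fits, no flush in flight
    CopyIntoNewBuffer,          // fits, but an in-flight flush still owns the old buffer
    WriteThrough,               // buffer empty and data does not fit: write directly
    FlushThenWriteConcurrently  // buffer has data and the write does not fit
}

public static class WritePlanner
{
    // Assumption: "fits" means count <= remaining space in the buffer.
    public static WriteAction Plan(int bufferSize, int buffered, int count, bool flushInProgress)
    {
        int remaining = bufferSize - buffered;
        if (count <= remaining)
        {
            return flushInProgress ? WriteAction.CopyIntoNewBuffer : WriteAction.CopyIntoBuffer;
        }

        // The data does not fit; the choice depends on whether anything is buffered.
        return buffered == 0 ? WriteAction.WriteThrough : WriteAction.FlushThenWriteConcurrently;
    }
}
```

Only the last case issues two operations, which is why it is the one that needs the combined `Task.WhenAll` result.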

The only really interesting case here is the last one, where there’s data in the buffer and not enough remaining space for the data being written. As I see it, there are only three possible options for how to handle this:

1. Synchronously flush the buffer, then do the write asynchronously. This is what the code does today. This has the significant downside that all WriteAsync calls with data <= bufferLength essentially become Write calls, making everything synchronous. Ugh!
2. Issue the flush asynchronously, and when it completes, issue the write asynchronously. This would be ideal, except that when the WriteAsync call returns to its synchronous caller, the Position may not accurately reflect where the next operation should be writing data. This breaks the previously discussed invariant.
3. Issue the flush asynchronously, and then regardless of whether it’s completed, issue the write asynchronously. This has the downside that in the case where the flush fails asynchronously, we may have updated the position for the subsequent write in a case where today we wouldn’t have.

I’ve suggested doing (3) above because it only negatively impacts the case where flushes fail (which is rare), we can mitigate it for the case where the flush fails synchronously (in which case we can detect that and can avoid starting the write), and in the case of failure the data in the file probably can’t be considered coherent anyway.

@ericstj, @tymlipari, @ianhays, thoughts?

ianhays · 2015-08-18T21:46:18Z

1 Synchronously flush the buffer, then do the write asynchronously. This is what the code does today. This has the significant downside that all WriteAsync calls with data <= bufferLength essentially become Write calls, making everything synchronous.

Delayed synchronous operations at that. The initial WriteAsync with data <= bufferLength when the buffer is empty will complete synchronously after the memcpy into the buffer, yes? But then when a subsequent call comes in that requires a flush, the actual writing of that first WriteAsync is performed synchronously.

That an asynchronous write uses the buffer at all seems foreign to me, but to use it synchronously feels downright incorrect.

2 ...except that when the WriteAsync call returns to its synchronous caller, the Position may not accurately reflect where the next operation should be writing data.

Why is this the case here but not in (3)? Wouldn't Position be set to the location of the successful flush + length of the write?

3 Issue the flush asynchronously, and then regardless of whether it’s completed, issue the write asynchronously. This has the downside that in the case where the flush fails asynchronously, we may have updated the position for the subsequent write in a case where today we wouldn’t have.

To clarify, would the scenario you mention look something like this:

buffer size 10 bytes
position = 0
WriteAsync(4 bytes @ position 0) - copies to buffer and returns completed Task;
position = 4
WriteAsync(4 bytes @ position 4) - copies to buffer and returns completed Task
position = 8
WriteAsync(4 bytes @ position 8) - runs asynchronously alongside the flushing of the buffer filled with data from the first two WriteAsync calls. But if the Flush fails, then the position of this third write will be different here than if the first two Writes were already flushed before the call.

stephentoub · 2015-08-18T22:01:05Z

The initial WriteAsync with data <= bufferLength when the buffer is empty will complete synchronously after the memcpy into the buffer, yes?

Correct. Which I think is fine. The buffer is an optimization that alleviates the need to make sys calls and hit the disk on every write operation.

But then when a subsequent call comes in that requires a flush, the actual writing of that first WriteAsync is performed synchronously.

Correct. That's the problem. Flushing of the buffer is done by writing it out synchronously rather than asynchronously.

Why is this the case here but not in (3)? Wouldn't Position be set to the location of the successful flush + length of the write?

It's the difference between (pseudo-code):

Task flushTask = WriteInternalAsync(_buffer, 0, _bufferedData);
Task writeTask = WriteInternalAsync(data, offset, count);

and

Task flushTask = WriteInternalAsync(_buffer, 0, _bufferedData);
Task writeTask = flushTask.ContinueWith(t => WriteInternalAsync(data, offset, count));

By the time the first block completes, Position has been updated to reflect the position after the second write. But by the time the second block completes, flushTask may not have completed and thus we may not have actually called WriteInternalAsync the second time yet, so Position may not be correct.

But if the Flush fails, then the position of this third write will be different here than if the first two Writes were already flushed before the call.

We'd issue two writes: one for the buffered 8 bytes and one for the caller's 4 bytes. If the one for the buffered 8 bytes fails synchronously (e.g. something goes wrong attempting to initiate the overlapped I/O), we don't try to do the second write, the position remains unchanged, no problem. If the one for the buffered 8 bytes succeeds, it's exactly as it is today, no problem. If the one for the buffered 8 bytes fails asynchronously, however, then we would already have issued the second 4 byte write, so the position would now be 12 bytes ahead of where we started rather than the 8 bytes ahead of where we'd started if we'd done the flushing synchronously.
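The 8-byte/4-byte scenario can be modeled with a toy class to show where Position ends up under option (3). This is a sketch under assumptions: `PositionModel` and `FlushAndWriteAsync` are hypothetical, and the model simply advances Position when an operation is issued while the supplied task decides when (and whether) it completes. It does not model the synchronous-failure path.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical toy model; not the FileStream implementation.
public sealed class PositionModel
{
    public long Position { get; private set; }

    // Models an overlapped write: Position advances as soon as the operation
    // is issued; the supplied task completes (or faults) later.
    public Task WriteInternalAsync(int count, Task completion)
    {
        Position += count;
        return completion;
    }

    // Option (3): issue the flush, then issue the write without waiting for it.
    public Task FlushAndWriteAsync(int buffered, int count, Task flushCompletion, Task writeCompletion)
    {
        Task flushTask = WriteInternalAsync(buffered, flushCompletion);
        Task writeTask = WriteInternalAsync(count, writeCompletion);
        return Task.WhenAll(flushTask, writeTask);
    }
}
```

Calling `FlushAndWriteAsync(8, 4, ...)` leaves Position at 12 as soon as the call returns, even if the flush task faults afterwards; that is the trade-off being accepted in the discussion.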

ianhays · 2015-08-18T22:35:46Z

But by the time the second block completes, flushTask may not have completed and thus we may not have actually called WriteInternalAsync the second time yet, so Position may not be correct.

I see, I was mixing up my sequence of events with (1).

I'd argue that having a potentially incorrect Position is a big enough problem to seek a different solution - to not do so would make concurrent async IO operations far more difficult to be sure about.

Between (1) and (3), I begrudgingly cast my vote for (3). The edge case in which it fails is narrow enough that I think the benefits gained from asynchronous writes with data.Length < bufferSize are worth the tradeoff.

tymlipari · 2015-08-19T21:38:41Z

@stephentoub I've pushed a new commit that I think handles your cases above. Let me know what you think.

stephentoub · 2015-08-20T02:52:02Z

src/System.IO.FileSystem/src/System/IO/Win32FileStream.cs

Every time we call FlushWriteAsync, we do these four lines to update _activeBufferOperation. Should they just be moved into FlushWriteAsync? This is the only place where there's the potential for avoiding it, and it's an uncommon case (an async FileStream being finalized with data in the buffer) and a cheap enough operation that the overhead doesn't seem problematic even if it's unnecessary. Worst case, it could be moved into its own helper rather than repeating it each time.

This is just personal preference, so feel free to leave it as-is, but I prefer to write such things as:

_activeBufferOperation = ActiveBufferOperation() ? Task.WhenAll(writeTask, _activeBufferOperation) : writeTask;

That way it's clear that we're always assigning into the field, and the conditional just determines exactly what we're assigning.
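As a rough illustration of moving that update into one place, the tracking logic could live in a small helper like the one below. `BufferOperationTracker` and `Track` are hypothetical names, and this is a sketch of the pattern rather than the code under review; only `_activeBufferOperation` and `HasActiveBufferOperation` are names taken from this thread.

```csharp
using System.Threading.Tasks;

// Hypothetical helper illustrating the single-assignment pattern above.
public sealed class BufferOperationTracker
{
    // Null until the first asynchronous flush is issued.
    private Task _activeBufferOperation;

    // True while a previously tracked flush has not yet completed.
    public bool HasActiveBufferOperation =>
        _activeBufferOperation != null && !_activeBufferOperation.IsCompleted;

    // Always assigns the field; the conditional only decides what is assigned.
    public void Track(Task flushTask)
    {
        _activeBufferOperation = HasActiveBufferOperation
            ? Task.WhenAll(flushTask, _activeBufferOperation)
            : flushTask;
    }
}
```

Because completed tasks are replaced rather than combined, the `Task.WhenAll` allocation only happens when flushes actually overlap.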

stephentoub · 2015-08-20T03:29:20Z

src/System.IO.FileSystem/tests/FileStream/WriteAsync.cs

It'd be good to have a test that tries to issue many writes/flushes to be run concurrently. The test you have and this should also verify that Position reflects the correct value immediately after the synchronous call to the XxAsync method returns. Note that the OS may actually choose to issue writes synchronously, and I believe it'll do so if the write would require extending the size of the file, so it'd be good to have the test verify behavior both with a zero-length file to start and with a file presized for all data needed (this could be done as a theory with a Boolean argument that controls whether you start by setting the length). I believe it'll also do writes synchronously for small writes, so it'd be good to also have a test that does large writes, like 100K at a time.

stephentoub · 2015-08-20T03:30:54Z

This is looking good, I think. Nice job, @tymlipari.

@ericstj, could you please weigh in on the approach and the change overall?

@ianhays, thoughts as well?

Change <= _bufferSize to < _bufferSize

stephentoub · 2015-08-20T04:09:12Z

src/System.IO.FileSystem/src/System/IO/Win32FileStream.cs

These two calls to ActiveBufferOperation() should now be to HasActiveBufferOperation.

Yup, looks like I was a bit too eager to push the change. I'm running a quick build and test locally and then I'll push the fix.

stephentoub · 2015-08-20T20:25:00Z

Thanks for doing this, @tymlipari.

This is a worthwhile change to take, especially since useAsync==true is the default now for FileStream in .NET Core. At the moment, though, this area doesn't have enough test coverage for us to feel confident in merging it (I discovered at least one bug for which I already have a fix locally, specifically the case where multiple concurrent writes are issued while there's an outstanding flush).

I want to write a bunch of additional tests for this and do some more inspection of it before we incorporate it. As such, I think the best thing to do at this point is for me to take this over. Once I'm ready, I'll submit a new PR that includes your commits plus additional ones with tests and any necessary code fixes; I'll cc you on that PR. We can leave this PR open as a placeholder until then.

Sound ok to you? If you'd prefer an alternate plan, please do let me know.

stephentoub · 2015-08-21T21:52:23Z

Opened #2929 to replace this (squashing and keeping the commits from this PR). Thanks, @tymlipari!

dnfclas added the cla-required label Aug 14, 2015

dnfclas added cla-signed and removed cla-required labels Aug 14, 2015

TheRealPiotrP reviewed Aug 14, 2015

Use cancellation token in FlushWriteAsync

17c5fc0

stephentoub reviewed Aug 14, 2015

Flush and write operation as helper function instead of using Task.Co…

5e05efc

…ntinueWith

tymlipari added 2 commits August 19, 2015 14:32

Change buffering logic to track flush task activity in order to minim…

84a9ce4

…ize buffer allocations.

Missed case for tracking async flush operations

cfeefdb

stephentoub reviewed Aug 20, 2015

Respond to code review feedback.

21da6c0

stephentoub reviewed Aug 20, 2015

Fix build break

521196b

Change <= _bufferSize to < _bufferSize

stephentoub reviewed Aug 20, 2015

stephentoub mentioned this pull request Aug 21, 2015

Improve FileStream WriteAsync buffering and performance #2929

Merged

stephentoub closed this Aug 21, 2015

karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
