Conversation

@kpritam (Contributor) commented May 20, 2024

/claim #8792

tapSink now uses merge with HaltStrategy.Both, which should guarantee execution of both sides (left and right) to completion.

This regression was introduced in PR #8311.
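
For context, here is a minimal sketch of the merge-based shape under discussion, assuming ZIO 2's public API (Queue, Take, ZStream#merge with HaltStrategy.Both). The helper name tapSinkSketch is hypothetical; this is not the actual ZStream#tapSink implementation:

```scala
import zio._
import zio.stream._

// Hypothetical sketch: route elements through a queue into the sink and
// merge the two sides with HaltStrategy.Both, so neither side is halted
// before the other completes.
def tapSinkSketch[R, E, A](
  stream: ZStream[R, E, A],
  sink: ZSink[R, E, A, Any, Any]
): ZStream[R, E, A] =
  ZStream.unwrapScoped[R] {
    for {
      queue <- ZIO.acquireRelease(Queue.bounded[Take[E, A]](1))(_.shutdown)
      // Left side: re-emit elements while forwarding them to the queue,
      // signalling end-of-stream when done.
      left = stream
               .mapChunksZIO(chunk => queue.offer(Take.chunk(chunk)).as(chunk))
               .ensuring(queue.offer(Take.end))
      // Right side: run the sink against the queue; emits nothing.
      right = ZStream.fromZIO(ZStream.fromQueue(queue).flattenTake.run(sink)).drain
    } yield left.merge(right, ZStream.HaltStrategy.Both)
  }
```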

@varshith257 (Contributor) commented May 20, 2024

@kpritam I am not authorised to review, but I have also done some trial and error with tapSink to check its behaviour.

Changing from forkIn(scope) to forkDaemon might make the test pass, but it can introduce unpredictability and resource-management issues. The original use of forkIn(scope) ensures that all resources (queues, etc.) are cleaned up predictably within the scope. This is crucial for the reliable and consistent behaviour of tapSink, which is what the Daemon -> Scope change in #8311 introduced.

Using forkDaemon can lead to incomplete processing and resource leaks, as it doesn't guarantee that resources are cleaned up before the scope ends.

@varshith257 (Contributor) commented May 21, 2024

  • forkIn(scope): When a fiber is forked with forkIn(scope), it is tied to the lifecycle of the specified scope. This means the fiber's finalization will be awaited as part of the scope's finalization, which ensures that resources are cleaned up predictably within the scope. This is crucial for avoiding resource leaks and for ensuring that all side effects (such as updating a queue) complete before the scope ends.

  • forkDaemon: When a fiber is forked with forkDaemon, it runs independently of the scope that created it. The main process does not wait for daemon fibers to complete their work, and there is no guarantee that resources will be cleaned up before the scope ends. This can lead to unpredictable behaviour and resource leaks. (See the sketch below.)
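
A minimal, hedged demo of that lifecycle difference, assuming only core ZIO 2 operators (ZIO.scoped, ZIO.scopeWith, forkIn, forkDaemon); the object name ForkLifecycleDemo and the work effect are illustrative, not code from the zio repository:

```scala
import zio._

object ForkLifecycleDemo extends ZIOAppDefault {
  // Stand-in effect that reports whether it finished or was interrupted.
  val work: UIO[Unit] =
    (ZIO.sleep(1.second) *> ZIO.debug("work finished"))
      .onInterrupt(ZIO.debug("work interrupted"))

  val run =
    for {
      // forkIn(scope): the fiber's lifetime is bounded by the scope; when
      // the scope closes, the fiber is interrupted and its finalization is
      // awaited before ZIO.scoped returns.
      _ <- ZIO.scoped {
             ZIO.scopeWith(scope => work.forkIn(scope)) *> ZIO.sleep(10.millis)
           }
      _ <- ZIO.debug("scope closed, forked fiber already finalized")
      // forkDaemon: the fiber lives in the global scope; nothing awaits it,
      // so cleanup is entirely the caller's responsibility.
      fiber <- work.forkDaemon
      _     <- ZIO.debug("daemon fiber may still be running here")
      _     <- fiber.interrupt
    } yield ()
}
```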

cc @eyalfa This is the behaviour I expect from Scope and Daemon.

@kpritam (Contributor, Author) commented May 21, 2024

@varshith257 There are a bunch of things to unfold here. For the record, I understand the difference between forkIn(scope) and forkDaemon quite well, and the use of forkDaemon here is not just to make the test pass; it is intentional.

This is my understanding based on my limited knowledge of the codebase 😉:

  • The primary objective of PR #8311 (Ensure Queue Will Be Shutdown Before Awaiting It In ZStream#tapSink) was to fix a memory leak in the tapSink implementation. Those changes are retained: queues are properly shut down in tapSink as before.
  • Channels are properly closed using (queueReader >>> self).toPullIn(scope) & (queueReader >>> that).toPullIn(scope).
  • Note that I wanted to use pullL.fork.zipWith(pullR.fork), which passes the slow-sink test, but then the "preserves scope of inner fibers" test becomes flaky, which is why I had to fall back to the original implementation that used forkDaemon.
  • Also note that tapSink did guarantee execution of both sides to completion before the introduction of forkIn(scope), which tells me that #8792 (ZStream.tapSink: either a flaky test or flaky implementation) is indeed a valid bug. (A sketch of that guarantee follows this list.)
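
For reference, a hedged reconstruction of the guarantee at stake in #8792, assuming zio-test; the spec name and values are illustrative, not the actual test from the repository:

```scala
import zio._
import zio.stream._
import zio.test._

// Even with a deliberately slow sink, tapSink should deliver every element
// to the sink before the stream completes.
object TapSinkGuaranteeSpec extends ZIOSpecDefault {
  def spec =
    test("tapSink runs the sink to completion even when it is slow") {
      for {
        ref  <- Ref.make(Chunk.empty[Int])
        slow  = ZSink.foreach((i: Int) => ZIO.sleep(10.millis) *> ref.update(_ :+ i))
        _    <- ZStream.range(0, 10).tapSink(slow).runDrain
        seen <- ref.get
      } yield assertTrue(seen == Chunk.fromIterable(0 until 10))
    } @@ TestAspect.withLiveClock
}
```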

@eyalfa (Contributor) commented May 21, 2024

> @kpritam I am not authorised to review, but I have also done some trial and error with tapSink to check its behaviour.
>
> Changing from forkIn(scope) to forkDaemon might make the test pass, but it can introduce unpredictability and resource-management issues. The original use of forkIn(scope) ensures that all resources (queues, etc.) are cleaned up predictably within the scope. This is crucial for the reliable and consistent behaviour of tapSink, which is what the Daemon -> Scope change in #8311 introduced.
>
> Using forkDaemon can lead to incomplete processing and resource leaks, as it doesn't guarantee that resources are cleaned up before the scope ends.

I was about to submit this exact same comment myself

@eyalfa (Contributor) commented May 21, 2024

@kpritam I tend to categorize this as a bug as well; I can't really see how changing to forkDaemon guarantees anything, and I suspect it just changes the probabilities.
I'm not sure why @adamgfraser changed from fork to forkScope in the first place (I might have my thoughts on this, but I don't have the time to dive that deep into the code at the moment), but I wouldn't change it before having a full understanding of the effect of a fiber's runtime scope on the merge operator.

The way I see it, the issue is that the finalization code does not wait for the fiber running the sink. So the trick is first to make sure the sink can complete (by offering the final end marker into the queue) and then to wait for the sink fiber to complete. However, this is not always the correct behaviour: in the case of stream interruption or failure, we want to interrupt the sink as well (unless there's a requirement to guarantee that the sink sees all successful elements...).
I think the merge-based implementation basically attempts to achieve just this, and it does so in most cases, simply because it effectively awaits the sink fiber upon successful completion before finalization, so this await is interruptible.
What happens when the downstream cancels 'early' is that the upstream is never pulled again and its finalizer is invoked; this is when the sink fiber gets interrupted. Identifying this case is a bit unpleasant, as it requires the implementation to keep track of upstream completion/failure/interruption and try to figure out why the finalizer was invoked.
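
A hedged sketch of the finalization order described above: signal end-of-stream first, then await the sink fiber on success, but interrupt it on failure or interruption. The function name finalizeSink and its parameters mirror the discussion, not the exact zio internals:

```scala
import zio._
import zio.stream.Take

// Hypothetical finalizer for a queue-fed sink fiber, dispatching on how
// the stream ended.
def finalizeSink[E, A](
  queue: Queue[Take[E, A]],
  sinkFiber: Fiber[E, Any]
)(exit: Exit[E, Any]): UIO[Unit] =
  exit match {
    case Exit.Success(_) =>
      // Let the sink drain the remaining elements, then wait for it.
      queue.offer(Take.end) *> sinkFiber.await.unit
    case Exit.Failure(_) =>
      // On failure or interruption, don't leave the sink waiting forever.
      queue.shutdown *> sinkFiber.interrupt.unit
  }
```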

@jdegoes (Member) commented May 21, 2024

I think the correct fix for this will come about through a deeper understanding of the underlying race condition causing the bug.

@jdegoes closed this May 21, 2024