This repository was archived by the owner on Feb 24, 2020. It is now read-only.

@steveej
Contributor

@steveej steveej commented Oct 14, 2015

Closes #1595.
Fixes #1590.

Collaborator

I have several issues with GoroutineAssistant's Fatalf:

  • The s channel should have a buffer (say of two elements), so sending a message there is not blocked.
  • It should not call Done() on the WaitGroup; every goroutine already does that as a deferred action. The problem was that this deferred action might never be executed, because the goroutine was blocked sending an error on a blocking channel.
  • The return statements after calls to Fatalf in goroutines were removed, which implies that Fatalf does not return, so it should call runtime.Goexit().
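
The three points above can be sketched as follows. This is a minimal illustration, not the rkt code: the `assistant` type, `newAssistant`, and `runFailing` are hypothetical names, and only the `s` channel, `Fatalf`, the WaitGroup, and `runtime.Goexit()` come from the review comment.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// assistant is a hypothetical stand-in for the GoroutineAssistant under review.
type assistant struct {
	s  chan error
	wg sync.WaitGroup
}

func newAssistant() *assistant {
	// Buffer of two elements, as suggested in the review, so that Fatalf
	// does not block while at most two goroutines report an error.
	return &assistant{s: make(chan error, 2)}
}

// Fatalf records the error and terminates the calling goroutine.
// It does not call wg.Done itself: the goroutine's own deferred Done
// runs while runtime.Goexit unwinds the goroutine.
func (a *assistant) Fatalf(format string, args ...interface{}) {
	a.s <- fmt.Errorf(format, args...)
	runtime.Goexit()
}

// runFailing starts one goroutine that fails. It reports whether any code
// after the Fatalf call was reached, and the error that was recorded.
func runFailing() (reached bool, err error) {
	a := newAssistant()
	a.wg.Add(1)
	go func() {
		defer a.wg.Done()
		a.Fatalf("step %d failed", 1)
		reached = true // never executed, because Fatalf does not return
	}()
	a.wg.Wait()
	close(a.s)
	return reached, <-a.s
}

func main() {
	reached, err := runFailing()
	fmt.Println(reached, err) // prints "false step 1 failed"
}
```

Because `runtime.Goexit()` runs deferred calls before the goroutine ends, callers can drop the `return` after `Fatalf` without skipping `wg.Done()`.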

@jonboulle
Contributor

Perhaps we need to start thinking about it as a gexpect helper, and use Close during shutdown.

Collaborator

Two of the three issues I raised earlier are still not addressed.

  • This function is not supposed to return - it should call runtime.Goexit() as the last thing in its body. We assume this function does not return in other places (we removed return clauses).
  • I think that a.s channel should be non-blocking (as in s: make(chan error, 10)).

Contributor Author

Thanks for the reminder. I added the call to runtime.Goexit(). I'm not convinced about making the channel buffered; allowing a dynamic channel size would make the code more complex.

Collaborator

I would just make it fixed at 10 and that's it. Go does not allow growing a channel's buffer anyway. For now we don't use many goroutines on a single assistant (2 at most), so it should be enough for the foreseeable future.

Contributor Author

Why exactly would you rather have a buffered channel?

Contributor Author

I'd rather have it thought out with an unbuffered channel, to take advantage of its synchronizing nature. This will force us to think about the written tests a little more and improve their quality. Buffered channels lead to fire-and-forget runs, and we'd need another mechanism to synchronize/stop failed tests (goroutines).

Collaborator

Sending to this channel is usually the last action a goroutine performs explicitly (that is, not counting the deferred actions).

I wanted a buffered channel just to let the goroutine die quickly, without waiting for the receiving side of the channel, so that deferred actions can be executed immediately.
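
The difference between the two positions can be seen in a small experiment. This is only a sketch: `deferredRanBeforeReceive` is a hypothetical helper, `close(done)` stands in for deferred cleanup such as `wg.Done()`, and the 200 ms timeout is an arbitrary choice that assumes the goroutine gets scheduled within that window.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// deferredRanBeforeReceive starts a goroutine that sends one error on a
// channel with the given capacity and then exits. It reports whether the
// goroutine's deferred action finished before anyone received the error.
func deferredRanBeforeReceive(capacity int) bool {
	errs := make(chan error, capacity)
	done := make(chan struct{})

	go func() {
		defer close(done) // stands in for deferred cleanup such as wg.Done()
		errs <- errors.New("test failed")
	}()

	select {
	case <-done:
		<-errs // the error was buffered; collect it afterwards
		return true
	case <-time.After(200 * time.Millisecond):
		<-errs // unblock the sender so that it can finish
		<-done
		return false
	}
}

func main() {
	fmt.Println("buffered:  ", deferredRanBeforeReceive(1))
	fmt.Println("unbuffered:", deferredRanBeforeReceive(0))
}
```

With capacity 0 the sender cannot pass the send until a receiver shows up, so its deferred action cannot run first; with capacity 1 the send completes at once and the goroutine can exit on its own.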

@steveej steveej self-assigned this Oct 14, 2015
@steveej steveej added this to the v0.10.0 milestone Oct 14, 2015
Collaborator

That will deadlock if spawning fails: it will try to send an error over a blocking channel that nobody is listening on yet.

Contributor Author

Right, we're not supposed to use ga outside of the goroutines, so SpawnOrFail is not supposed to be used here.

@jonboulle
Contributor

I tend to agree with krnowak that ultimately this should live in the context, but I think this is OK as a stopgap for now.

@steveej
Contributor Author

steveej commented Oct 16, 2015

@krnowak PTAL. rktRunCtx now takes care of registered child processes. It does add a little overhead to kill and wait every child process, but that's a small price to pay for correctness.
I've added one usage example in the TestNetHost* tests and we could use this in all tests that spawn children.

@steveej steveej changed the title from "testutils: fix GoroutineAssistant and httputils #1595" to "testutils: fix GoroutineAssistant and httputils" Oct 16, 2015
Collaborator

The same thing also has to be done in the reset function.

Collaborator

Or maybe put this into runGC? But that might be confusing.

Contributor Author

Isn't it assumed that ctx.cleanup() is called before ctx.reset()? Does it make sense to run reset() without cleanup()? If not, we should have a variable tracking the context's state and run cleanup() when only reset() is invoked.
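
The state-tracking idea could look roughly like this. It is a sketch under assumptions: `runCtx`, its `cleaned` flag, and the `cleanups` counter are hypothetical and exist only to make the behaviour observable; the real context does far more work in both methods.

```go
package main

import "fmt"

// runCtx is a hypothetical stand-in for the test context discussed above.
type runCtx struct {
	cleaned  bool // true once cleanup has run for the current state
	cleanups int  // counts how often the cleanup work actually ran
}

// cleanup releases resources once per "dirty" period; repeated calls are no-ops.
func (c *runCtx) cleanup() {
	if c.cleaned {
		return
	}
	c.cleanups++ // placeholder for killing children, removing directories, etc.
	c.cleaned = true
}

// reset makes sure cleanup has happened before the state is re-created,
// so callers that invoke only reset() still get a clean shutdown.
func (c *runCtx) reset() {
	c.cleanup()
	c.cleaned = false // placeholder for re-creating the context's state
}

func main() {
	c := &runCtx{}
	c.reset()   // cleanup runs implicitly
	c.cleanup() // cleans the freshly reset state
	c.cleanup() // no-op
	fmt.Println(c.cleanups) // prints 2
}
```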

@krnowak
Collaborator

krnowak commented Oct 16, 2015

RegisterChild is good for now. Maybe in a follow-up PR we could modify the spawnOrFail function to take the context and register the child immediately.

Also, is the commit "tests/net: one call to ga.Add(1) per goroutine" needed? I suppose we can drop it.

jonboulle and others added 8 commits October 16, 2015 19:33
Use a single channel for shutting down GoroutineAssistant, of type error
instead of string; in this way, Done() and Fatalf() are serialized and
can't race.

Fixes #1590

Calling log.Fatal would circumvent proper test shutdown/error
propagation. All callers are already checking for the error
appropriately.

After a GoroutineAssistant is initialised during tests, all operations
that might end in a testing.Fatal should be serialized through it.

The test ACI server closes both Msg and Stop channels on Close(). The
Stop channel never receives any messages - it is only used to listen
for service shutdown. The channel listening is done in serverHandler,
which is run in its own goroutine. The Stop channel was used shortly
before, but I wrongly removed it during review process. Because of
that, the function never returned and the goroutine was unnecessarily
kept alive.

Otherwise there are issues when cleaning up the rkt context.

Since the tests that use the inspect binary rely on stdout parsing,
printing this too early will cause a race condition between the serve
and test code. Simply moving the print to a point after the Listen()
call prevents that.

* don't branch since Fatalf will not return
* exit the calling goroutine in Fatalf

When tests spawn children and then cause a failure, they don't have the
chance to wait for the children to complete. These tests are supposed to
use the ctx.RegisterChild function for every child so that ctx.cleanup()
will be able to handle child shutdown.
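
A rough sketch of the child registration described in the last commit message follows. It is not the rkt implementation: `runCtx`, `startSleeper`, and the use of plain `*exec.Cmd` values are assumptions made for illustration, and the `sleep` command assumes a Unix-like system. Kill and wait errors are logged rather than silently dropped.

```go
package main

import (
	"log"
	"os/exec"
)

// runCtx is a hypothetical stand-in for the test context.
type runCtx struct {
	children []*exec.Cmd
}

// RegisterChild remembers a started child so that cleanup can reap it later.
func (c *runCtx) RegisterChild(cmd *exec.Cmd) {
	c.children = append(c.children, cmd)
}

// cleanup kills and waits for every registered child and returns how many
// were handled. Errors are only logged, since cleanup should keep going.
func (c *runCtx) cleanup() int {
	n := 0
	for _, cmd := range c.children {
		if cmd.Process == nil {
			continue // the child was never started
		}
		if err := cmd.Process.Kill(); err != nil {
			log.Printf("killing child %d: %v", cmd.Process.Pid, err)
		}
		// Wait reports an error for a killed process; that is expected here.
		if err := cmd.Wait(); err != nil {
			log.Printf("waiting for child %d: %v", cmd.Process.Pid, err)
		}
		n++
	}
	c.children = nil
	return n
}

// startSleeper launches a long-running child (assumes "sleep" is on PATH).
func startSleeper() (*exec.Cmd, error) {
	cmd := exec.Command("sleep", "60")
	return cmd, cmd.Start()
}

func main() {
	ctx := &runCtx{}
	cmd, err := startSleeper()
	if err != nil {
		log.Fatalf("starting child: %v", err)
	}
	ctx.RegisterChild(cmd)
	log.Printf("reaped %d child(ren)", ctx.cleanup())
}
```

Waiting after the kill is what prevents leftover zombie processes; the extra time per child is the overhead mentioned earlier in the thread.
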
@krnowak
Collaborator

krnowak commented Oct 16, 2015

LFAD if green.

Contributor

What about the errors from these two?

Contributor Author

We could pass it on and then handle (ignore) it in the invoking function, would you prefer that?

Contributor

I suggest doing a nominal check and just logging for information purposes. I'm just trying to future-proof us against the kind of murky mad situation that got us here in the first place.

If you feel strongly against that, let's at least move to the better practice/style of explicitly ignoring the results (Process.Wait() returns a *os.ProcessState as well as an error):
_, _ = child.Cmd.Process.Wait()

Contributor Author

I'm fine with logging the error. If it was just me, we'd be logging a lot more.

@jonboulle
Contributor

@krnowak
Collaborator

krnowak commented Oct 19, 2015

This PR is dragging on and on and is gathering more and more changes. I'd like to merge ASAP, but first, some of my notes/questions:

But can we please merge it already? It fixes those pesky failures in networking tests and we are good. I'd prefer to address the other issues (no leftover goroutines, agree on logging solution, proper rkt registration for all tests and so on) in separate, uh, issues. Github issues.

@jonboulle
Contributor

SGTM

@steveej steveej mentioned this pull request Oct 19, 2015
4 tasks
steveej added a commit that referenced this pull request Oct 19, 2015
 testutils: fix GoroutineAssistant and httputils
@steveej steveej merged commit 1d21d76 into rkt:master Oct 19, 2015
@steveej steveej deleted the goassistant branch October 19, 2015 19:49