
Appveyor tests are being flaky #3895


Closed
emmatyping opened this issue Aug 31, 2017 · 22 comments

@emmatyping
Member

This was first brought up in #3846. Moved here to avoid clutter in the PR.

You can see the results here: https://ci.appveyor.com/project/JukkaL/mypy/build/1.0.1042

The relevant exception is OSError: [Errno 9] Bad file descriptor, which I believe comes from pytest trying to read from stdout. For some reason stdout is either locked, unavailable, or Appveyor is incorrectly reporting its handle.
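For context, here is a minimal sketch (mine, not taken from the CI logs) of how this class of error surfaces: any OS-level operation on a file descriptor that has already been closed fails with errno 9.

    import os

    # Illustration only: operating on a descriptor the OS has already
    # closed raises the same error seen in the CI logs.
    fd = os.dup(1)   # duplicate stdout's file descriptor
    os.close(fd)     # close the duplicate again
    try:
        os.write(fd, b"hello")
    except OSError as e:
        print(e)     # [Errno 9] Bad file descriptor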

@elazarg chimed in

IIRC it seemed to happen when multiple instances of pytest were active. It stopped happening after I changed the PR to use only a single instance. I'm not sure if this information helps, though...

If true, AIUI, this would explain the issue. However, I'm not sure we currently spawn multiple pytest instances concurrently, so this may indicate that our scheduling logic is broken, or that this is an entirely different issue.

@gvanrossum
Member

What makes you think it's related to stdout? The tracebacks all seem to point to this line:

    def resume(self):
        self.syscapture.resume()
>       os.dup2(self.tmpfile_fd, self.targetfd)

This does appear to be part of a pytest internal class (FDCapture.resume()).
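For reference, os.dup2 copies one file descriptor onto another and fails with EBADF when the source descriptor is no longer valid. A minimal sketch mirroring that failing line (my own, not pytest's actual code):

    import os
    import tempfile

    # If the temp file backing the capture has been closed, dup2 cannot
    # copy its descriptor onto the target stream.
    tmpfile = tempfile.TemporaryFile()
    tmpfile_fd = tmpfile.fileno()
    tmpfile.close()             # simulate premature cleanup
    try:
        os.dup2(tmpfile_fd, 2)  # what FDCapture.resume() effectively does
    except OSError as e:
        print(e)                # [Errno 9] Bad file descriptor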

@emmatyping
Member Author

The previous stack frame is

    def resume_capturing(self):
        if self.out:
            self.out.resume()
        if self.err:
>           self.err.resume()

I suppose stderr is more accurate (my apologies), but the effect is the same. This is part of pytest capturing output to the console.

@emmatyping
Member Author

Apparently PyInstaller had similar issues with pytest-xdist, so I believe we can attribute this to problems with pytest-xdist's scheduling.

At PyBay, @pkch and I saw an interesting talk about pytest-concurrent. It may be too early for a switch, but it might be nice to use in the future, e.g. for "allowing certain tests to be grouped so that they are executed sequentially".

@emmatyping
Member Author

Scrolling through Twitter this morning, I also ran into this: https://twitter.com/pumpichank/status/903280328978178048

So maybe it's Appveyor flakiness? Hard to tell.

@emmatyping
Member Author

Since it has been three weeks, I think (I hope) it is safe to close this.

@elazarg
Contributor

elazarg commented Sep 23, 2017

It happened again in #3973.

@emmatyping
Member Author

Darn. Okay, I think that rules out simple Appveyor flakiness. It seems our entire test suite needs to be investigated (see also #3975).

emmatyping reopened this Sep 23, 2017
@elazarg
Contributor

elazarg commented Sep 23, 2017

@ethanhs see my recent comment there. If my comment is related to the bug, then it seems to be unrelated to Appveyor flakiness, or at least not precisely the same problem. Besides, the problems with Appveyor only began after my PR (#3870), whereas the problems with Travis came up earlier. What do you think?

@emmatyping
Member Author

@elazarg The timeout issue seems quite plausible as the cause on Travis; that is a good find! However, the Appveyor flakes seem to be something different. Unless timeouts cause issues with streams, I don't see how a timed-out test could break the duplication of the stderr stream (which I believe is the symptom we are seeing).

@gvanrossum
Member

This is slowing down our progress. @ethanhs have you thought about this more?

@emmatyping
Member Author

Only a little. Since Travis is having timeout issues, my initial hunch was that this was related. However, I believe I have eliminated timeouts as the root of the issue: if I force tests to time out by spawning an inordinately large number of processes, it does not produce the errors we are seeing. I looked at the relevant pytest source, and it appears the failure happens when copying the file descriptor of a temp file (used to capture output) back to stderr. My hunch is that something is closing that file prematurely, thus invalidating the descriptor. I don't yet know why, but that is what I will look at next.
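To make that hypothesis concrete, here is a rough model of the capture lifecycle. The class below is my own sketch, not pytest's implementation: output is redirected into a temp file, suspended around a test, and resumed afterwards. If anything closes the temp file in between, resume() is the first call to notice.

    import os
    import tempfile

    class CaptureModel:
        """Hypothetical model of pytest's FDCapture, for illustration only."""

        def __init__(self, targetfd: int) -> None:
            self.targetfd = targetfd
            self.saved_fd = os.dup(targetfd)        # remember the real stream
            self.tmpfile = tempfile.TemporaryFile()
            self.tmpfile_fd = self.tmpfile.fileno()

        def suspend(self) -> None:
            os.dup2(self.saved_fd, self.targetfd)   # restore the real stream

        def resume(self) -> None:
            # Fails with EBADF if the temp file was closed in the meantime.
            os.dup2(self.tmpfile_fd, self.targetfd)

    cap = CaptureModel(2)      # capture stderr, as in the traceback above
    cap.suspend()
    cap.tmpfile.close()        # "something" cleans the temp file up early...
    try:
        cap.resume()           # ...and this is where the error surfaces
    except OSError as e:
        print(e)               # [Errno 9] Bad file descriptor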

@gvanrossum
Member

Wow, I keep looking at that log and getting pulled in. At the very end there are two "normal" failures, and the second of these seems to be a clue: testAbs2_python2. It's failing with the bad fd error in snap(), and the first traceback is in the same test. Looking at a more recent failure log we see the same pattern for testGenericMatch_python2. In any case all the other errors are just cascading from this -- the tmpfile on the FDCapture object got closed unexpectedly and then everything breaks. (IOW I think the fd copy failure is secondary too, while self.tmpfile being closed is primary.)

@gvanrossum
Member

I just had another failure like this, and it failed on testAbs2_python2 again. There's nothing special about that test except that it's the first test in test-data/unit/python2eval.test. And testGenericMatch_python2 is the 11th. Coincidence? Maybe we should pass a process limit to runtests.py in appveyor.yml too, like we recently added to .travis.yml?

@emmatyping
Member Author

I don't think that will help. As I previously mentioned, I simulated a high process-to-thread ratio (the issue with Travis) by spawning a hundred processes on my 12-thread CPU. Nothing happened except that a few tests timed out (empty output).
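Roughly, the experiment looked like this (the command being run is a stand-in, not the real test invocation):

    import subprocess
    import sys

    # Start far more worker processes than the machine has hardware
    # threads, then collect their output. On an overloaded machine some
    # simply time out (empty output); none reproduced the bad-fd error.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", "print('ok')"],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        for _ in range(100)
    ]
    for p in procs:
        out, err = p.communicate(timeout=10)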

@gvanrossum
Member

gvanrossum commented Sep 26, 2017 via email

@gvanrossum
Member

If all else fails, maybe we should just skip the python2eval tests in the AppVeyor script; they've never found anything Windows-specific.

@emmatyping
Member Author

Digging a little bit more.

Our tests (the Python eval tests specifically) use subprocesses and capture their stdout and stderr. The subprocess module uses OS-level calls to open and close handles to these file descriptors, and I was fairly sure the handle was being closed by subprocess. Based on this pull request and this change set, that seems quite plausible. In those examples, however, the use of sys.stdout/sys.stderr is blamed, whereas we use PIPE (also with pytest-xdist) and get the exact same result. So I believe something else is going on.

I'm not certain exactly what, however. Is it perhaps that we don't call communicate() on the process after killing it? According to https://github.com/python/cpython/blob/master/Lib/subprocess.py#L1081, the fds stay open until a later communicate() call.
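If that is the culprit, the fix would be the standard kill-then-reap pattern from the subprocess docs, sketched here with a placeholder command:

    import subprocess
    import sys

    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=1)
    except subprocess.TimeoutExpired:
        proc.kill()
        # Without this second communicate(), the pipe descriptors stay
        # open until the Popen object is garbage collected.
        out, err = proc.communicate()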

@gvanrossum
Member

There's something magical about the number 10, apparently. I got another failure, and the failing test, testTupleAsSubtypeOfSequence_python2, is number 21 in the file.

emmatyping added a commit to emmatyping/mypy that referenced this issue Sep 27, 2017
This replaces the old subprocess-based method. In addition, the test cases
are no longer run, in order to reduce test time. This has led to a 10-20%
speedup in pytest. This might help with python#3895, as it removes the
subprocesses.

Fixes python#1671
@gvanrossum
Member

Jukka and I debated this briefly offline. We decided that it would be simplest to try to get this off our backs by no longer running the python2eval tests on AppVeyor -- they probably don't bring much of interest to the table. Perhaps the easiest way to accomplish this would be to not install Python 2, since then runtests.py will automatically skip those tests. (But how?) Or we could just add some "-x" flag to the runtests.py call in appveyor.yml.

@emmatyping
Member Author

I'd much rather have the tests excluded in the AppVeyor config file. I have yet to hit the same issue locally, so I see no reason to exclude the tests locally on Windows.

@gvanrossum
Member

OK, can you submit a diff for that?

@emmatyping
Member Author

I will do that when I am back in front of a computer in an hour.
