Integration testing for downstream dependencies #15992
A cron job or something like the one you proposed would do the trick. But if a change is mandatory, the maintainers need to be notified early somehow, no? But I suppose that is for internal use and to detect the potential impact of a change on the downstream libs.
Frankly, the downstream packages should be doing this integration testing themselves, by testing various use cases and checking against scikit-learn master. We should be encouraging downstream packages to include such integration tests, e.g. various usages of their estimators with grid search (see the sketch just after this comment).

But it's reasonable that we at least run a suite of integration tests as part of a release cycle. Putting this in Azure makes it easier for our maintainers.

I don't mind if we also run a cron job, but then we will have to deal with more false alarms (e.g. downstream test suites failing due to poor or delayed management of their other dependencies).
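As an illustration, here is a minimal sketch of the kind of integration test a downstream package could keep in its own suite and run against scikit-learn master. `MajorityClassifier` is a hypothetical stand-in for the downstream project's estimator, not code from any real package.

```python
# Minimal sketch (assumed example, not from a real downstream project) of an
# integration test exercising a third-party estimator inside scikit-learn's
# Pipeline and GridSearchCV.
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class MajorityClassifier(BaseEstimator, ClassifierMixin):
    """Toy estimator that always predicts the majority class."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_, counts = np.unique(y, return_counts=True)
        self.majority_ = self.classes_[np.argmax(counts)]
        return self

    def predict(self, X):
        return np.full(shape=len(X), fill_value=self.majority_)


def test_estimator_works_inside_grid_search():
    rng = np.random.RandomState(0)
    X = rng.rand(30, 4)
    y = np.tile([0, 1], 15)
    grid = GridSearchCV(
        make_pipeline(StandardScaler(), MajorityClassifier()),
        param_grid={"majorityclassifier__smoothing": [0.5, 1.0]},
        cv=3,
    )
    grid.fit(X, y)
    # The point is not accuracy but that composition with Pipeline and
    # GridSearchCV keeps working against scikit-learn master.
    assert hasattr(grid, "best_estimator_")
```

Running such a test in the downstream project's CI against a nightly build of scikit-learn would surface breaking changes before a release.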
Completely agree that it's our task downstream to test against sklearn master; personally it has been on my list for a while (with a disturbingly old WIP PR). I just didn't get to the point of setting up the build with the nightly build rather than pip install (which takes ages).
I'd be happy to see, in the first instance, a script that runs the test suites of these downstream packages. We can look separately at how we best deploy that on CI.
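A rough sketch of what such a script could look like, assuming the downstream packages ship their tests inside the installed package so `pytest --pyargs` can find them. The package list and the nightly wheel index URL are illustrative assumptions, not an agreed-upon configuration.

```python
"""Sketch of a runner for downstream test suites against a scikit-learn
nightly build. Package list and nightly index URL are assumptions for
illustration only."""
import subprocess
import sys

# PyPI name -> import name; only packages that ship tests inside the package.
DOWNSTREAM = {
    "imbalanced-learn": "imblearn",
}
NIGHTLY_INDEX = "https://pypi.anaconda.org/scientific-python-nightly-wheels/simple"


def run(*cmd):
    print("+", " ".join(cmd), flush=True)
    subprocess.run(cmd, check=True)


def main():
    # Install the downstream packages first with their released dependencies...
    for pypi_name in DOWNSTREAM:
        run(sys.executable, "-m", "pip", "install", pypi_name)
    # ...then overwrite scikit-learn with the nightly build.
    run(sys.executable, "-m", "pip", "install", "--pre", "--upgrade",
        "--extra-index-url", NIGHTLY_INDEX, "scikit-learn")
    failures = []
    for pypi_name, import_name in DOWNSTREAM.items():
        # --pyargs looks up the tests shipped inside the installed package.
        result = subprocess.run(
            [sys.executable, "-m", "pytest", "--pyargs", import_name]
        )
        if result.returncode != 0:
            failures.append(pypi_name)
    if failures:
        sys.exit("downstream test failures: " + ", ".join(failures))


if __name__ == "__main__":
    main()
```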
I think this is reasonable.
@bsipocz This is something that we do in
I would suggest the packages from the
I have started an integration testing library to start addressing this issue: https://github.com/thomasjpfan/scikit-learn-integration-testing
Added to the 1.0 milestone so that we do not forget about it.
Makes sense. I am thinking of moving the CI to GitHub Actions so I can integrate it with GitHub discussions or issues to ping the maintainers of the upstream projects when there is a test failure.
Seems like we did forget about it @ogrisel 😁. With some of the changes coming in the next release, and the one after, this would be really nice to have though. Moving to 1.1 for now.
I think that we can address the following when we have a more granular
Very unlikely to be part of 1.1, moving to 1.2.
I think https://github.com/scientific-python/reverse-dependency-testing from @martinfleis is a good candidate to test downstream dependencies. What do you think?
@jjerphan it may work for a subset of packages out of the box. Packages organised like imbalanced-learn will work since they have tests included inside the package itself. But scikit-matter, for example, will not, as it does not ship tests. For the latter, we will need to find another way of getting the tests into the action, but I haven't looked into that yet.
As a workaround, is there any way that the action could git clone the repo (assuming it's provided and it's not a huge pile of hacks to get the project links from PyPI)? Or should we, as a community, just strongly recommend including the test suite in the packaging, too?
That is not complicated, just not implemented at the moment.
I don't think so. @henryiii was advising against that (though I personally do see a certain value in that) while suggesting that tests should be in the sdist. We should then fetch the sdist from PyPI and run the tests. Though in practice, I am not sure how common that practice is. I bet that it is currently more likely that tests are shipped with the package than that they're in the sdist and not in the package.
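For packages that do not ship tests in the wheel, here is a hedged sketch of the sdist route discussed above. The helper is hypothetical, and whether a given project actually includes its tests in the sdist has to be checked per project.

```python
"""Sketch of the sdist-based route: download only the source distribution
from PyPI, unpack it, and run pytest from the unpacked tree. This assumes
the project includes its tests in the sdist, which is exactly the practice
being debated above."""
import subprocess
import sys
import tarfile
from pathlib import Path


def run_tests_from_sdist(pypi_name: str, workdir: str = "sdist-tests") -> int:
    work = Path(workdir) / pypi_name
    work.mkdir(parents=True, exist_ok=True)
    # Download the sdist only, without dependencies or wheels.
    subprocess.run(
        [sys.executable, "-m", "pip", "download", "--no-deps",
         "--no-binary", ":all:", "--dest", str(work), pypi_name],
        check=True,
    )
    sdist = next(work.glob("*.tar.gz"))
    with tarfile.open(sdist) as archive:
        archive.extractall(work)
    source_root = next(p for p in work.iterdir() if p.is_dir())
    # An alternative workaround is `git clone <repo>` instead of the sdist,
    # when the repository URL can be resolved from the PyPI metadata.
    return subprocess.run(
        [sys.executable, "-m", "pytest", str(source_root)]
    ).returncode


if __name__ == "__main__":
    sys.exit(run_tests_from_sdist(sys.argv[1]))
```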
Some packages that have a top-level "tests" that is not in the wheel:
Also most (or all) of Scikit-HEP. Every package I'm involved with uses this structure. |
I feel the ones listed above are more generic, and part of the basic infrastructure, rather than the end-user libraries we try to tackle with downstream testing. E.g. occasionally we found it rather useful to be able to ask users to run the test suite themselves. Similarly, here for downstream testing it could be useful to be able to just run the tests very easily.
I'm not against that. Also, pytest's config doesn't ship in the wheel. PS: Those "simple" packages are in the top 10 or so most downloaded packages, or are from the PyPA, by the way.
Oh, we fully agree on most of the things. Also, I didn't mean "simple" to say they are not important; those are indeed the most important infrastructure in the whole ecosystem. However, I feel this issue about downstream dependencies targets less well-engineered or less well-maintained projects, to see the effect of deprecations and changes introduced in core projects. In practice that means testing not the most relied-on libraries but those further down the chain, so the design choices of the core infrastructure libraries have little relevance here.
We had some IRL discussions with @jeremiedbb regarding some expectations. I am not sure that getting the latest release from PyPI is actually enough. I would expect to use the If we want to get the
We need a new Continuous Integration configuration to launch the tests of some popular downstream projects.
Candidate projects:
These tests would not be run in PRs unless we use a specific tag in the commit message, for instance. This test setup would be run by maintainers in the release branches.
We do something similar in the cloudpickle project with the [ci downstream] commit tag: https://github.com/cloudpipe/cloudpickle/blob/master/.travis.yml#L37-L70
For scikit-learn this would probably need a bit more scripting as the build is more complex than for a pure python project such as cloudpickle but the general idea would be the same.
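To make the analogy concrete, here is a minimal sketch of how a CI step could gate the downstream job on a [ci downstream] commit tag. The environment variable name is a provider-specific assumption, not part of any existing scikit-learn configuration.

```python
"""Minimal sketch of a commit-message gate for the downstream job, mirroring
cloudpickle's [ci downstream] convention. BUILD_SOURCEVERSIONMESSAGE is how
Azure Pipelines exposes the commit message; on other providers we fall back
to `git log`."""
import os
import subprocess
import sys


def commit_message() -> str:
    message = os.environ.get("BUILD_SOURCEVERSIONMESSAGE")
    if message is None:
        message = subprocess.run(
            ["git", "log", "-1", "--pretty=%B"],
            capture_output=True, text=True, check=True,
        ).stdout
    return message


if "[ci downstream]" not in commit_message():
    print("Commit not tagged with [ci downstream]; skipping downstream tests.")
    sys.exit(0)

# At this point the job would build scikit-learn from source and invoke the
# downstream test runner sketched earlier in the thread.
print("Running downstream integration tests...")
```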