[MRG] Include pxd-files into the installation #14896

Closed
wants to merge 7 commits into from

Conversation

@realead realead commented Sep 5, 2019

Reference Issues/PRs

Fixes #14847

What does this implement/fix? Explain your changes.

All pxd-files are now copied into the installation, so users can reuse them when building their own Cython extensions.

pyximport is used to compile the test pyx-files at runtime. A rebuild is forced so that the cache isn't used.

Any other comments?

I didn't include the pxd-files from sklearn.svm (libsvm.pxd and liblinear.pxd): building against them requires not only the h-files from the src-folder but also the c- and cpp-files, because of this include:

cdef extern from "libsvm_helper.c":
   ....

which leaks all definitions into the cythonized cpp-file, so the linker needs everything from the cpp-files in order to resolve all those leaked symbols.

If these pxd-files should be included in the installation as well, one first needs to introduce libsvm_helper.h, build libsvm_helper.c as a separate source file, and include the new header from the pxd instead (and the same for liblinear.pxd).

@realead
Author

realead commented Sep 5, 2019

I have some questions:

  • pyximport uses the .pyxbld directory for caching the results of the build. The rebuild is enforced, but some data still stays in the cache (it shouldn't be a problem, but you never know). Is this a problem? It is possible to put the build cache in any other directory, which can be wiped after the run. Is there already such a place?
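
The idea of a build cache that is wiped after the run can be sketched as follows. This is only an illustration under stated assumptions: the helper name `throwaway_build_dir` is hypothetical and not part of this PR, and the build step is simulated by writing an empty file. In a real test the directory would be handed to the build step, for instance through the `build_dir` argument of `pyximport.install`.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


@contextlib.contextmanager
def throwaway_build_dir(prefix="pyxbld-"):
    """Yield a fresh directory and delete it, with its contents, on exit.

    Hypothetical helper: the name and the prefix are illustrative only.
    """
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Remove everything the build left behind, even if the body raised.
        shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    with throwaway_build_dir() as build_dir:
        # Stand-in for a real build step writing into the cache directory.
        (build_dir / "artifact.so").write_bytes(b"")
        print("inside the block, cache exists:", build_dir.is_dir())
    print("after the block, cache exists:", build_dir.exists())
```

With this shape nothing is left in the user's home directory, because every run builds into its own directory and removes it afterwards.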

@realead realead force-pushed the fix_gh14847 branch 2 times, most recently from b38ae97 to 59851bd September 11, 2019 19:07
Member

@rth rth left a comment

Thanks for looking into this @realead !

It's a thorough approach, but I wonder if we can get away without compilation tests. In particular, we could add the relevant files to MANIFEST.in and then add a single test that checks for the included pxd-files somewhere in,

sklearn/tests/test_check_build.py

i.e. that the following files exist,

$ find sklearn -iname '*.pxd'
sklearn/ensemble/_hist_gradient_boosting/common.pxd
sklearn/linear_model/sgd_fast.pxd
sklearn/neighbors/typedefs.pxd
sklearn/neighbors/dist_metrics.pxd
sklearn/neighbors/quad_tree.pxd
sklearn/svm/libsvm.pxd
sklearn/svm/liblinear.pxd
sklearn/tree/_tree.pxd
sklearn/tree/_criterion.pxd
sklearn/tree/_utils.pxd
sklearn/tree/_splitter.pxd
sklearn/utils/murmurhash.pxd
sklearn/utils/weight_vector.pxd
sklearn/utils/_cython_blas.pxd
sklearn/utils/_random.pxd
sklearn/utils/fast_dict.pxd
sklearn/utils/seq_dataset.pxd

with respect to

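import os
import sklearn

# directory of the installed package; sklearn.__file__ itself points at __init__.py
base_dir = os.path.dirname(sklearn.__file__)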

could be enough, since in some CI jobs we build wheels and run the tests with pytest --pyargs sklearn, which would check that these files are included in the wheels. I'm not sure there is a need to check that we can cimport them in Cython.
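
A minimal sketch of such a presence check, assuming the expected paths are listed relative to the package directory (e.g. tree/_tree.pxd rather than sklearn/tree/_tree.pxd). The function name `missing_pxd_files` and the demo tree are hypothetical; a real test would pass the directory of the installed sklearn package instead.

```python
import tempfile
from pathlib import Path


def missing_pxd_files(package_dir, expected):
    """Return the entries of `expected` that are absent under `package_dir`.

    `expected` holds "/"-separated paths relative to the package directory.
    """
    package_dir = Path(package_dir)
    return sorted(rel for rel in expected if not (package_dir / rel).is_file())


if __name__ == "__main__":
    # Demo on a throwaway tree so the sketch runs without sklearn installed.
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "pkg"
        (pkg / "tree").mkdir(parents=True)
        (pkg / "tree" / "_tree.pxd").touch()
        print(missing_pxd_files(pkg, ["tree/_tree.pxd", "tree/_utils.pxd"]))
```

A test in the suite would then simply assert that the returned list is empty. It needs no compiler, but it also says nothing about whether the files can actually be cimported.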

@realead
Author

realead commented Sep 12, 2019

@rth

I see two problems with the approach of testing only the presence of pxd-files in the installation:

  • It is possible that not all pxd-files should be exposed. We could then end up in a situation like the current one: sklearn.tree._utils.pxd is present in the installation but cannot be cimported, because it needs pxd-files which aren't part of the installation.

  • Even if all pxd-files should be in the installation, some of them possibly could not be used/cimported (similar to the current situation with sklearn.svm.libsvm.pxd): h-files (or even c-files or libraries) might be needed in order to cimport/use the pxd-files, and these might not be part of the installation.
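
The first problem, a pxd-file that cimports another pxd-file missing from the installation, can be detected without compiling anything. The following is a rough sketch under stated assumptions, not a replacement for a real cimport test: the function name `unresolved_cimports` is hypothetical, only absolute cimports inside the given package are checked, relative cimports and package-level __init__.pxd files are ignored, and missing h- or c-files (the second problem) are not detected at all.

```python
import re
import tempfile
from pathlib import Path

# Matches "from pkg.mod cimport name" and "cimport pkg.mod" at line start.
CIMPORT_RE = re.compile(
    r"^\s*(?:from\s+([\w.]+)\s+cimport\b|cimport\s+([\w.]+))", re.MULTILINE
)


def unresolved_cimports(root, package):
    """Map each .pxd under root/package to in-package cimports lacking a .pxd."""
    root = Path(root)
    problems = {}
    for pxd in sorted((root / package).rglob("*.pxd")):
        text = pxd.read_text(encoding="utf-8")
        for match in CIMPORT_RE.finditer(text):
            module = match.group(1) or match.group(2)
            if not module.startswith(package + "."):
                continue  # external (numpy, libc, ...) or relative cimport
            target = root.joinpath(*module.split(".")).with_suffix(".pxd")
            if not target.is_file():
                key = pxd.relative_to(root).as_posix()
                problems.setdefault(key, []).append(module)
    return problems


if __name__ == "__main__":
    # Demo tree: _tree.pxd cimports _utils, whose pxd-file was not shipped.
    with tempfile.TemporaryDirectory() as tmp:
        tree = Path(tmp) / "pkg" / "tree"
        tree.mkdir(parents=True)
        (tree / "_tree.pxd").write_text(
            "cimport numpy as np\nfrom pkg.tree._utils cimport helper\n",
            encoding="utf-8",
        )
        print(unresolved_cimports(tmp, "pkg"))
```

Such a scan is cheap enough for the regular test suite, whereas the second problem still requires an actual build.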

Because the behavior of software tends to become what is tested rather than what it is supposed to be, it is probably more robust to test that the pxd-files can be cimported, rather than that the pxd-files are present.

But obviously it is your call what you want to have as a test.

Btw, do you know what can/should be done about the failing coverage test?

@realead realead changed the title [WIP] Include pxd-files into the installation [MRG] Include pxd-files into the installation Sep 15, 2019
@rth rth added this to the 0.22 milestone Oct 29, 2019
@rth
Member

rth commented Oct 30, 2019

I agree that dependencies between pxd-files are a tricky question, but these are integration/build tests that only need to be run once before a release. Making them run as part of the unit test suite, particularly if they need to compile something each time they are run, is problematic. The resulting .so would also add some measurable size to the built wheels.

I would still prefer,

  • either a standalone way to check this, e.g. under maint_tools
  • or some lighter tests in the test suite (that don't require compilation)

The next release is happening soon and I think it would be good to have it there. Also, because there was some refactoring of the API, the paths to some of the pxd-files might have changed, and it would be really useful to double-check that the included ones are still correct.

Could you also please merge master in?

@realead
Author

realead commented Nov 1, 2019

@rth

I'm not sure why building is a problem, and I also don't think that the so-files will end up in the wheels (pyximport puts them into home/<user_name>/.pyxbld/lib.linux-x86_64-3.7, or the equivalent for another OS/Python version).

However, I have rebased and introduced/moved tests to a standalone tester under maint_tools.

PS: not sure what this failing test is about; it seems to be a missing sh-file. Hopefully my PR has nothing to do with it.

@realead
Author

realead commented Nov 27, 2019

no longer needed.

@realead realead closed this Nov 27, 2019
Successfully merging this pull request may close these issues.

Consider a consistent policy towards including pxd-files into the installation
2 participants