[MRG] Fix pre-build checks to handle compilers from build_ext options #16193

Merged: 15 commits, Jul 10, 2020

jeremiedbb
Member

@jeremiedbb jeremiedbb commented Jan 24, 2020

There are (at least) two ways to specify the compiler used to build the project:

  • CC=<compiler> python setup.py build_ext -i
  • python setup.py build_ext --compiler=<compiler> -i

In the second option, <compiler> must be one of the predefined compilers in distutils or in numpy.distutils. For example, to build with icc, it's --compiler=intelem.

Currently, the checks that run before building sklearn (compiling a test program, and compiling a test program that uses OpenMP) ignore the compiler specified through the second option.
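To make the gap concrete, here is a minimal sketch of how a pre-build check could see both ways of choosing a compiler. The helper names `compiler_from_build_ext` and `compiler_sources` are hypothetical and this is not the code merged in this PR; it is a simplified hand-rolled scan of the command line that only understands `--compiler=<name>` and `--compiler <name>`, and it makes no claim about which source takes precedence.

```python
def compiler_from_build_ext(argv):
    """Return the --compiler value passed to build_ext, or None.

    Simplified: any bare word is treated as the start of a new command
    section, so other options that take a separate value (e.g. "-j 4")
    placed before --compiler are not handled.
    """
    in_build_ext = False
    args = iter(argv)
    for arg in args:
        if arg.startswith("--compiler="):
            if in_build_ext:
                return arg.split("=", 1)[1]
        elif arg == "--compiler":
            value = next(args, None)  # consume the option's value
            if in_build_ext:
                return value
        elif not arg.startswith("-"):
            # A bare word names the command whose options follow.
            in_build_ext = arg == "build_ext"
    return None


def compiler_sources(argv, environ):
    """Report both places a compiler may have been requested."""
    return {
        "build_ext_option": compiler_from_build_ext(argv),
        "CC": environ.get("CC"),
    }


print(compiler_sources(["build_ext", "--compiler=intelem", "-i"], {}))
print(compiler_sources(["build_ext", "-i"], {"CC": "icc"}))
```

A check that only looks at `CC` would report nothing for the first call, which is the behaviour this PR sets out to fix.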

I also updated get_openmp_flag to better handle icc on Linux (although icc accepts -fopenmp, -qopenmp is the recommended flag).
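As an illustration of the flag selection being described, the following is a sketch only, not the merged `get_openmp_flag`. The function name `sketch_openmp_flag` and its explicit `platform` argument are assumptions made so the logic can be exercised without a real compiler. The Linux and darwin branches follow what is said in this thread; the Windows branch (`/Qopenmp` for Intel, `/openmp` otherwise) is not discussed here and is included as an assumption.

```python
def sketch_openmp_flag(platform, compiler):
    """Pick OpenMP compile flags from a platform string and a compiler name.

    `platform` mimics sys.platform; `compiler` is the compiler executable
    name (for example "icc", "icl" or "gcc").
    """
    is_intel = "icc" in compiler or "icl" in compiler
    if platform == "win32":
        # Assumption: Windows handling is not covered in this thread.
        return ["/Qopenmp"] if is_intel else ["/openmp"]
    if platform == "darwin" and is_intel:
        # The thread states -openmp is what is currently set on darwin.
        return ["-openmp"]
    if platform == "linux" and is_intel:
        # icc accepts -fopenmp, but -qopenmp is the recommended spelling.
        return ["-qopenmp"]
    # Default seen in the build log for gcc.
    return ["-fopenmp"]


print(sketch_openmp_flag("linux", "icc"))
print(sketch_openmp_flag("linux", "gcc"))
```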

I temporarily modified a CI job, Linux_Runs_pylatest_conda_mkl, to build sklearn with icc (installed through Intel oneAPI) to check that it works as expected. Installing icc is actually pretty fast, and the job takes no longer than the others. We could keep a job that builds with icc, not necessarily this one. What do you think?

@oleksandr-pavlyk I'd like your feedback on this. In particular, do you use the same command to build sklearn with icc?

@jeremiedbb
Member Author

The doc test of SpectralCoclustering fails randomly. I don't know if it's specific to building with icc, but it occurs quite often.

@oleksandr-pavlyk
Contributor

We use CC=icc python setup.py config_cc --compiler=intelem install.

The CC=icc is necessary because config_cc somehow does not affect build_clib's notion of compiler.

Using ICC implies that numpy.distutils's compiler options are going to be used, and it likely uses -O3.

@jeremiedbb
Member Author

This command:
python setup.py build_ext --compiler=intelem -i build_clib --compiler=intelem
makes it possible to specify the compiler for both build_ext and build_clib without setting CC.
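For scripted builds, the same invocation can be assembled programmatically. This is a small sketch; `build_command` is a hypothetical helper, not part of sklearn, and it only constructs the argument list quoted above.

```python
import sys


def build_command(compiler, inplace=True):
    """Assemble a setup.py call that passes --compiler to both commands."""
    cmd = [sys.executable, "setup.py", "build_ext", f"--compiler={compiler}"]
    if inplace:
        cmd.append("-i")
    # build_clib needs its own --compiler option.
    cmd += ["build_clib", f"--compiler={compiler}"]
    return cmd


print(" ".join(build_command("intelem")[1:]))
```

From a source checkout, the list could be handed to `subprocess.run(build_command("intelem"), check=True)`.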

@NicolasHug
Member

> The doc test of SpectralCoclustering fails randomly. I don't know if it's specific to building with icc, but it occurs quite often.

I don't think it's related to this PR, since it also occurred in #16175.

@jeremiedbb
Member Author

Thanks for the info. I wonder whether it occurs more often with icc than with gcc, because here it fails in roughly 2 out of 3 runs.

@jeremiedbb
Member Author

> Using ICC implies that numpy.distutils's compiler options are going to be used, and it likely uses -O3.

Actually, it's also -O3 with gcc.

@@ -40,6 +40,8 @@ def get_openmp_flag(compiler):
        # export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/opt/libomp/lib
        # -L/usr/local/opt/libomp/lib -lomp"
        return []
    elif sys.platform == "linux" and "icc" in compiler:
Contributor

I think this applies to sys.platform == "darwin" as well.

Member Author

On darwin we currently set -openmp. Is that outdated, and is -qopenmp preferable?

Member Author

Also, is icl a real thing? I remember checking for icc or icl because I had the feeling that icl could be a name of the icc executable, but I have never actually seen it anywhere.

Contributor

Yes, it is icl on Windows.

Member

@ogrisel ogrisel left a comment

Some comments below.

What would be the motivation for extending our CI to build with another compiler? To "shake" our test suite further and reveal more potential numerical stability issues?

The ICC installation overhead seems minimal when using the APT repo as done in this PR. However, I am not sure what the long-term maintenance overhead will be.

@jeremiedbb
Copy link
Member Author

Following the discussion in the meeting, I added a Travis cron job where sklearn is built with icc, instead of a regular Azure job. I think it's ready to merge now.

@jeremiedbb jeremiedbb changed the title [WIP] Fix pre-build checks to handle compilers from build_ext options [MRG] Fix pre-build checks to handle compilers from build_ext options Jan 27, 2020
.travis.yml Outdated
env:
- CHECK_WARNING="true"
- BUILD_WITH_ICC="true"
if: type = cron OR commit_message =~ /\[scipy-dev\]/
Member

Do we want to use the same scipy-dev tag for this?

Member Author

I reused the same name to avoid having options that we won't remember :)
But I'm OK with a different name. icc-build?

Member

Either icc or icc-build is fine with me.
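To see what a dedicated commit-message tag would change, the job condition from the .travis.yml snippet can be emulated in a few lines. This is a sketch only: `should_run_icc_job` is a hypothetical helper, the default tag `icc-build` is merely one of the names suggested in this thread, and Python's `re` stands in for Travis's own condition evaluator.

```python
import re


def should_run_icc_job(event_type, commit_message, tag="icc-build"):
    """Mimic `type = cron OR commit_message =~ /\\[<tag>\\]/`."""
    # The tag must appear wrapped in literal square brackets.
    pattern = re.compile(r"\[" + re.escape(tag) + r"\]")
    return event_type == "cron" or pattern.search(commit_message) is not None


print(should_run_icc_job("cron", "MNT update docs"))
print(should_run_icc_job("push", "FIX pre-build checks [icc-build]"))
print(should_run_icc_job("push", "FIX pre-build checks [scipy-dev]"))
```

With a separate tag, a commit marked `[scipy-dev]` no longer triggers the icc job unless the run is a cron build.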

@oleksandr-pavlyk
Contributor

oleksandr-pavlyk commented Jan 27, 2020

When I try to run the build command, I get:

(sk-dev2) [15:14:23 vmlin scikit-learn_src]$ python setup.py build_ext --compiler=intelem -i -j 4 build_clib --compiler=intelem
Partial import of sklearn during the build process.
/localdisk/miniconda3_latest/envs/sk-dev2/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'project_urls'
  warnings.warn(msg)
/localdisk/miniconda3_latest/envs/sk-dev2/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'python_requires'
  warnings.warn(msg)
/localdisk/miniconda3_latest/envs/sk-dev2/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
C compiler: gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wformat -Wformat-security -fstack-protector-all -D_FORTIFY_SOURCE=2 -fpic -fPIC -O3 -Wformat -Wformat-security -fstack-protector-all -D_FORTIFY_SOURCE=2 -fpic -fPIC -O3 -fPIC

compile options: '-c'
gcc: test_program.c
gcc -pthread objects/test_program.o -o test_program
C compiler: gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wformat -Wformat-security -fstack-protector-all -D_FORTIFY_SOURCE=2 -fpic -fPIC -O3 -Wformat -Wformat-security -fstack-protector-all -D_FORTIFY_SOURCE=2 -fpic -fPIC -O3 -fPIC

compile options: '-c'
extra options: '-fopenmp'
gcc: test_program.c
gcc -pthread objects/test_program.o -o test_program -fopenmp
Compiling sklearn/__check_build/_check_build.pyx because it changed.
....

So the log reports that GCC is being used to test the OpenMP flags. This is unexpected. The only workaround I found is to set CC=icc.

Edit: I realized I built the master branch; let me try with your branch.
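Reading the log by eye works, but the same check can be scripted. Below is a minimal sketch that pulls the executable name from every "C compiler:" line, the format shown in the log above. `compilers_in_log` is a hypothetical helper and the sample text is an abbreviated excerpt of that log, not new output.

```python
def compilers_in_log(log_text):
    """Return the executable named on each 'C compiler:' line of a build log."""
    prefix = "C compiler:"
    found = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            tokens = line[len(prefix):].split()
            if tokens:
                # The first token after the prefix is the compiler executable.
                found.append(tokens[0])
    return found


# Abbreviated excerpt of the log quoted above.
sample = """\
C compiler: gcc -pthread -O3 -fPIC

compile options: '-c'
extra options: '-fopenmp'
gcc: test_program.c
"""
print(compilers_in_log(sample))
```

If the returned list contains `gcc` while `--compiler=intelem` was requested, the pre-build checks did not pick up the requested compiler.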

@oleksandr-pavlyk
Contributor

It behaves as advertised on the branch. However, I found that

python setup.py config_cc --compiler=intelem build_ext --compiler=intelem -i -j 4 build_clib --compiler=intelem

which used to run on master (albeit using gcc), now throws an error:

distutils.errors.DistutilsArgError: invalid command 'config_cc'

@jeremiedbb
Member Author

@oleksandr-pavlyk Thanks for testing it and for the feedback.

> distutils.errors.DistutilsArgError: invalid command 'config_cc'

I fixed that. But I don't think you need config_cc anymore, now that the compiler can be specified for both build_ext and build_clib.

Member

@ogrisel ogrisel left a comment

LGTM. Could you please add a short section at the end of the advanced installation page in the doc to summarize the main commands to build with ICC (under Linux with the oneAPI repo)?

@ogrisel
Member

ogrisel commented Feb 8, 2020

Thanks for the doc. LGTM again.

@oleksandr-pavlyk
Contributor

BTW, please be advised that oneAPI is currently governed by a beta license, which precludes distribution of produced binaries. The license will change once oneAPI is released.

Member

@agramfort agramfort left a comment

I think it's very valuable to test with ICC and document how to build sklearn with it.

thx @jeremiedbb

@glemaitre glemaitre merged commit cac1672 into scikit-learn:master Jul 10, 2020
@jeremiedbb jeremiedbb mentioned this pull request Jul 10, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020