Topic
Opening this issue as a central point for information, discussion, etc. around the CUDA 12 migration and the new style CUDA SDK packages.
What follows is background about how CUDA builds are supported now, what is changing about them (and why), and what users can do to adopt the new best practices.
Background
As a little bit of background, the conda-forge CUDA builds up to this point have relied on a few key components:
- Docker images based on nvidia/cuda (here's one template image as an example)
- The CUDA redistributable libraries combined together in the cudatoolkit package (for linkage into other packages built in conda-forge)
- The nvcc wrapper script metapackage to glue the previous two pieces together during a build

There are of course plenty more pieces, like how these components are used in the global pinnings, how CUDA migrators work, how other infrastructure around CUDA works, and how tooling configures CUDA builds (in staged-recipes and feedstocks). However, the big picture of Docker images, CUDA libraries, and a wrapper script to bind them is the gist. This has worked OK for a while, but there are needs it didn't satisfy.
Feedback
Based on user feedback on this model over the years, we have learned a few key things:
- The cudatoolkit package is bulky (for example: Is it necessary to install cudatoolkit with pyarrow 11 on Linux? arrow-cpp-feedstock#962)
- There are other components users would like access to (ptxas executable cudatoolkit-feedstock#72)
- The current model has also created integration pains at multiple points
Follow-up
Based on this feedback, we started exploring a new model for handling these packages ( conda-forge/cudatoolkit-feedstock#62 ). After a bit of iterating with various stakeholders, we felt ready to start packaging all of these components ( conda-forge/staged-recipes#21382 ). This admittedly has been a fair bit of work, and we appreciate everyone's patience while we have been working through it (as well as everyone who shared feedback). That said, it looks like we are nearing the end of package addition and are moving on to next steps.
Structure
Compiler
One particular component that has taken a fair amount of effort to structure correctly has been the compiler package. However, after some final restructuring ( conda-forge/cuda-nvcc-feedstock#12 ) and some promising test results on a few CUDA-enabled feedstocks, this now seems to be working as intended. There are a few particular points of note about how the compiler package is structured (and, correspondingly, how other CTK packages are structured).
The compiler package is designed to be used in cross-compilation builds. As a result, all headers and libraries live in a targets-based structure. This is best observed in this code from cuda-nvcc:

```bash
[[ "@cross_target_platform@" == "linux-64" ]] && targetsDir="targets/x86_64-linux"
[[ "@cross_target_platform@" == "linux-ppc64le" ]] && targetsDir="targets/ppc64le-linux"
[[ "@cross_target_platform@" == "linux-aarch64" ]] && targetsDir="targets/sbsa-linux"
```
As a result, $PREFIX/$targetsDir contains the headers, libraries, etc. needed for building for that target platform. The right paths should already be picked up by nvcc and gcc (the host compiler), so please let us know if that is not happening for some reason.
CUDA Toolkit Packages
As noted previously, cudatoolkit does not exist in this new model (starting with CUDA 12.0). Also, no libraries are pulled in beyond cudart (in particular its static library). So other libraries like libcublas, libcurand, etc. need to be added explicitly to requirements/host for CUDA 12. In particular, there are *-dev packages (like libcublas-dev) that would need to be added to host. A -dev package contains both the headers and the shared libraries. The -dev packages also add run_exports for their corresponding runtime library, so that dependency will be satisfied automatically through the usual conda-build behavior.
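As a rough sketch of what that looks like in a recipe (the particular library here is just an example), only the -dev package goes in host, and the runtime library comes along via its run_exports:

```yaml
requirements:
  host:
    ...
    # the -dev package provides the headers and shared libraries for building
    - libcublas-dev  # [(cuda_compiler_version or "").startswith("12")]
  run:
    ...
    # no explicit libcublas entry is needed here: libcublas-dev's run_exports
    # adds a pinned libcublas runtime dependency automatically
```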
Building off the compiler section above, the move to support cross-compilation means that packages supply headers, libraries, etc. under a different path than $PREFIX. Instead the paths look something like this (using $targetsDir from above):
- Headers: $PREFIX/$targetsDir/include
- Libraries: $PREFIX/$targetsDir/lib
- Stub libraries (if any): $PREFIX/$targetsDir/lib/stubs
Note that these paths shouldn't be needed anywhere, since nvcc and gcc know about them (when {{ compiler("cuda") }} is added to requirements/build). If any issues related to this do come up, please reach out.
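For instance, here is a minimal sketch of the corresponding build section (nothing here beyond what is described above): requesting the compilers is all that should be needed, with no include or library paths under $PREFIX/$targetsDir passed by hand:

```yaml
requirements:
  build:
    - {{ compiler("c") }}     # the host compiler (gcc on Linux)
    - {{ compiler("cuda") }}  # nvcc; both compilers already pick up
                              # $PREFIX/$targetsDir/include and $PREFIX/$targetsDir/lib
```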
CUDA version alignment
Previously the cudatoolkit package had served a CUDA version alignment function ( #687 ). Namely, installing cudatoolkit=11.2 would ensure all packages supported CUDA 11.2. This could be useful when trying to satisfy some hardware constraints.
Another issue the cudatoolkit package solved, which may be less obvious at the outset, is that it provided a consistent set of libraries that are all part of the same CUDA release (say 11.8, for example). This is trivially solved with one package. However, splitting these out into multiple packages (and adding more packages as well) makes this problem more complicated. The cuda-version package also comes in here, by aligning all CUDA packages to a particular CUDA version.
Finally, as there is a fissure between the CUDA 11 & 12 worlds in terms of how these CUDA packages are handled, the cuda-version package mends this gap. In particular, cuda-version constrains cudatoolkit, and cuda-version is backported to all CUDA versions conda-forge has shipped. As a result, cuda-version can be used in both CUDA 11 & 12 use cases to smooth over some of these differences and standardize CUDA version handling.
More details about the cuda-version package can be found in this README.
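As an illustrative sketch of that alignment role (the environment name and library selection here are hypothetical), pinning cuda-version pulls every CUDA package toward the same release, and on CUDA 11 the same kind of pin constrains cudatoolkit:

```yaml
# environment.yml (sketch)
name: cuda-aligned
channels:
  - conda-forge
dependencies:
  # aligns the CUDA packages below to a single CUDA release;
  # with CUDA 11, e.g. cuda-version=11.8 would constrain cudatoolkit instead
  - cuda-version=12.0
  - libcublas
  - libcurand
```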
Final Observations
It's worth noting that this new structure differs notably from the current one in conda-forge, where we have Docker images that place the full CUDA Toolkit install in /usr/local/cuda and runtime libraries are added automatically. This is done to address both use cases where only cudart or nvcc are needed (with no additional dependencies) and use cases where extra features are needed. Trying to address both of these means a greater ability to fine-tune use cases, but it also means a bit more work for maintainers. Hopefully this provides more context around why these changes are occurring and what they mean for maintainers and users.
Next Steps
Recently the CUDA 12 migrator was merged ( conda-forge/conda-forge-pinning-feedstock#4400 ). Already a handful of CUDA 12 migrator PRs have gone out.
For feedstocks that statically link cudart (the default behavior) and don't use other libraries, there may be nothing additional to do.
For feedstocks that depend on other CUDA Toolkit components (say cuBLAS and cuRAND), those can be added to host with a selector like so:
```diff
diff --git a/recipe/meta.yaml b/recipe/meta.yaml
--- a/recipe/meta.yaml
+++ b/recipe/meta.yaml
@@ -2,4 +2,6 @@
   host:
     - python
     - pip
     - numpy
+    - libcublas-dev  # [(cuda_compiler_version or "").startswith("12")]
+    - libcurand-dev  # [(cuda_compiler_version or "").startswith("12")]
```
Similar patterns can be followed for other dependencies. Also, please feel free to reach out for help here if needed.
On CUDA 12.x
Initially we just added CUDA 12.0.0. We are now adding CUDA 12.x releases (with x > 0). In particular, we have added CUDA 12.1 ( conda-forge/cuda-feedstock#11 ) and are working on CUDA 12.2 ( conda-forge/cuda-feedstock#13 ).
If users want to pin tightly to a specific CUDA 12.x version during the build, we would recommend adding cuda-version in the relevant requirements sections. For example, to constrain during the build (though not necessarily at runtime):
```yaml
requirements:
  build:
    ...
    - {{ compiler('cuda') }}                    # [(cuda_compiler or "None") != "None"]
  host:
    ...
    - cuda-version {{ cuda_compiler_version }}  # [(cuda_compiler_version or "None") != "None"]
  run:
    ...
```
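If a runtime constraint is desired as well, one possibility (a sketch only, not a prescribed pattern; the exact bounds depend on the package) is a run_constrained entry, which bounds the CUDA version in an environment without pulling cuda-version in unconditionally:

```yaml
requirements:
  # ... build/host/run as above ...
  run_constrained:
    # sketch: keep co-installed CUDA packages at a compatible version
    - cuda-version >={{ cuda_compiler_version }}  # [(cuda_compiler_version or "None") != "None"]
```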
Thanks all! 🙏