Commit a9950b1

[README] Add a list of install options (#1492)
* add a list of install options
  Signed-off-by: Masaki Kozuki <[email protected]>
* verbose about `fused_layer_norm_cuda` and `fast_layer_norm`
  Signed-off-by: Masaki Kozuki <[email protected]>
1 parent 6a40a0a commit a9950b1

2 files changed

Lines changed: 52 additions & 16 deletions

README.md

Lines changed: 37 additions & 1 deletion
@@ -103,6 +103,8 @@ amp.load_state_dict(checkpoint['amp'])
 Note that we recommend restoring the model using the same `opt_level`. Also note that we recommend calling the `load_state_dict` methods after `amp.initialize`.
 
 # Installation
+Each [`apex.contrib`](./apex/contrib) module requires one or more install options other than `--cpp_ext` and `--cuda_ext`.
+Note that contrib modules do not necessarily support stable PyTorch releases.
 
 ## Containers
 NVIDIA PyTorch Containers are available on NGC: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch.
@@ -128,7 +130,7 @@ cd apex
 pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
 ```
 
-Apex also supports a Python-only build via
+APEX also supports a Python-only build via
 ```bash
 pip install -v --disable-pip-version-check --no-cache-dir ./
 ```
@@ -144,3 +146,37 @@ A Python-only build omits:
 `pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .` may work if you were able to build Pytorch from source
 on your system. A Python-only build via `pip install -v --no-cache-dir .` is more likely to work.
 If you installed Pytorch in a Conda environment, make sure to install Apex in that same environment.
+
+
+## Custom C++/CUDA Extensions and Install Options
+
+If a requirement of a module is not met, then it will not be built.
+
+| Module Name | Install Option | Misc |
+|---------------|------------------|--------|
+| `apex_C` | `--cpp_ext` | |
+| `amp_C` | `--cuda_ext` | |
+| `syncbn` | `--cuda_ext` | |
+| `fused_layer_norm_cuda` | `--cuda_ext` | [`apex.normalization`](./apex/normalization) |
+| `mlp_cuda` | `--cuda_ext` | |
+| `scaled_upper_triang_masked_softmax_cuda` | `--cuda_ext` | |
+| `generic_scaled_masked_softmax_cuda` | `--cuda_ext` | |
+| `scaled_masked_softmax_cuda` | `--cuda_ext` | |
+| `fused_weight_gradient_mlp_cuda` | `--cuda_ext` | Requires CUDA>=11 |
+| `permutation_search_cuda` | `--permutation_search` | [`apex.contrib.sparsity`](./apex/contrib/sparsity) |
+| `bnp` | `--bnp` | [`apex.contrib.groupbn`](./apex/contrib/groupbn) |
+| `xentropy` | `--xentropy` | [`apex.contrib.xentropy`](./apex/contrib/xentropy) |
+| `focal_loss_cuda` | `--focal_loss` | [`apex.contrib.focal_loss`](./apex/contrib/focal_loss) |
+| `fused_index_mul_2d` | `--index_mul_2d` | [`apex.contrib.index_mul_2d`](./apex/contrib/index_mul_2d) |
+| `fused_adam_cuda` | `--deprecated_fused_adam` | [`apex.contrib.optimizers`](./apex/contrib/optimizers) |
+| `fused_lamb_cuda` | `--deprecated_fused_lamb` | [`apex.contrib.optimizers`](./apex/contrib/optimizers) |
+| `fast_layer_norm` | `--fast_layer_norm` | [`apex.contrib.layer_norm`](./apex/contrib/layer_norm). different from `fused_layer_norm` |
+| `fmhalib` | `--fmha` | [`apex.contrib.fmha`](./apex/contrib/fmha) |
+| `fast_multihead_attn` | `--fast_multihead_attn` | [`apex.contrib.multihead_attn`](./apex/contrib/multihead_attn) |
+| `transducer_joint_cuda` | `--transducer` | [`apex.contrib.transducer`](./apex/contrib/transducer) |
+| `transducer_loss_cuda` | `--transducer` | [`apex.contrib.transducer`](./apex/contrib/transducer) |
+| `cudnn_gbn_lib` | `--cudnn_gbn` | Requires cuDNN>=8.5, [`apex.contrib.cudnn_gbn`](./apex/contrib/cudnn_gbn) |
+| `peer_memory_cuda` | `--peer_memory` | [`apex.contrib.peer_memory`](./apex/contrib/peer_memory) |
+| `nccl_p2p_cuda` | `--nccl_p2p` | Requires NCCL >= 2.10, [`apex.contrib.nccl_p2p`](./apex/contrib/nccl_p2p) |
+| `fast_bottleneck` | `--fast_bottleneck` | Requires `peer_memory_cuda` and `nccl_p2p_cuda`, [`apex.contrib.bottleneck`](./apex/contrib/bottleneck) |
+| `fused_conv_bias_relu` | `--fused_conv_bias_relu` | Requires cuDNN>=8.4, [`apex.contrib.conv_bias_relu`](./apex/contrib/conv_bias_relu) |
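
Reviewer note: to make the table concrete, the sketch below shows one way the listed options might be combined with the README's `pip install` command, forwarding each option through `--global-option`, and how one could check afterwards which extension modules are importable. The helper names `build_pip_command` and `is_built` are hypothetical and not part of Apex; only the base command and the option and module names come from the README.

```python
import importlib.util
import shlex

# Base command copied from the README's install instructions.
BASE_CMD = ["pip", "install", "-v", "--disable-pip-version-check", "--no-cache-dir"]


def build_pip_command(options, path="./"):
    """Return a pip argument list that forwards each install option.

    `options` holds entries from the "Install Option" column, e.g. "--xentropy".
    Hypothetical helper for illustration; not part of Apex.
    """
    cmd = list(BASE_CMD)
    for opt in options:
        if not opt.startswith("--"):
            raise ValueError(f"install options start with '--': {opt!r}")
        cmd.append(f"--global-option={opt}")
    cmd.append(path)
    return cmd


def is_built(module_name):
    """Report whether a top-level module (e.g. "amp_C") can be found.

    find_spec only locates the module; it does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    cmd = build_pip_command(["--cpp_ext", "--cuda_ext", "--fast_layer_norm"])
    print(shlex.join(cmd))  # shell-ready form of the command
    for name in ("apex_C", "amp_C", "fast_layer_norm"):
        print(name, "found" if is_built(name) else "not found")
```

Because a module whose requirement is not met is not built, a check like `is_built` after installation is a quick way to see which rows of the table actually took effect on a given system.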

setup.py

Lines changed: 15 additions & 15 deletions
@@ -686,21 +686,6 @@ def check_cudnn_version_and_warn(global_option: str, required_cudnn_version: int
         )
     )
 
-# note (mkozuki): Now `--fast_bottleneck` option (i.e. apex/contrib/bottleneck) depends on `--peer_memory` and `--nccl_p2p`.
-if "--fast_bottleneck" in sys.argv:
-    sys.argv.remove("--fast_bottleneck")
-    raise_if_cuda_home_none("--fast_bottleneck")
-    if check_cudnn_version_and_warn("--fast_bottleneck", 8400):
-        subprocess.run(["git", "submodule", "update", "--init", "apex/contrib/csrc/cudnn-frontend/"])
-        ext_modules.append(
-            CUDAExtension(
-                name="fast_bottleneck",
-                sources=["apex/contrib/csrc/bottleneck/bottleneck.cpp"],
-                include_dirs=[os.path.join(this_dir, "apex/contrib/csrc/cudnn-frontend/include")],
-                extra_compile_args={"cxx": ["-O3"] + version_dependent_macros + generator_flag},
-            )
-        )
-
 if "--cudnn_gbn" in sys.argv:
     sys.argv.remove("--cudnn_gbn")
     raise_if_cuda_home_none("--cudnn_gbn")
@@ -759,6 +744,21 @@ def check_cudnn_version_and_warn(global_option: str, required_cudnn_version: int
             f"Skip `--nccl_p2p` as it requires NCCL 2.10.3 or later, but {_available_nccl_version[0]}.{_available_nccl_version[1]}"
         )
 
+# note (mkozuki): Now `--fast_bottleneck` option (i.e. apex/contrib/bottleneck) depends on `--peer_memory` and `--nccl_p2p`.
+if "--fast_bottleneck" in sys.argv:
+    sys.argv.remove("--fast_bottleneck")
+    raise_if_cuda_home_none("--fast_bottleneck")
+    if check_cudnn_version_and_warn("--fast_bottleneck", 8400):
+        subprocess.run(["git", "submodule", "update", "--init", "apex/contrib/csrc/cudnn-frontend/"])
+        ext_modules.append(
+            CUDAExtension(
+                name="fast_bottleneck",
+                sources=["apex/contrib/csrc/bottleneck/bottleneck.cpp"],
+                include_dirs=[os.path.join(this_dir, "apex/contrib/csrc/cudnn-frontend/include")],
+                extra_compile_args={"cxx": ["-O3"] + version_dependent_macros + generator_flag},
+            )
+        )
+
 
 if "--fused_conv_bias_relu" in sys.argv:
     sys.argv.remove("--fused_conv_bias_relu")
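
Reviewer note: the moved block follows a pattern that recurs in this file: remove a custom flag from `sys.argv`, check a requirement, then either register the extension or skip it. The sketch below isolates that pattern under stated assumptions. It is not Apex's `setup.py`: `consume_flag` and `plan_extensions` are hypothetical names, the requirement checks are stand-in callables, and the example only illustrates that flags are handled in a fixed order, so `--fast_bottleneck` is processed after `--peer_memory` and `--nccl_p2p`.

```python
import warnings


def consume_flag(argv, flag):
    """Remove `flag` from `argv` in place and report whether it was present."""
    if flag in argv:
        argv.remove(flag)
        return True
    return False


def plan_extensions(argv, requirements):
    """Return the names of extensions to build, in flag-handling order.

    `requirements` maps a flag to (extension name, requirement-check callable).
    A requested flag whose requirement is unmet is skipped with a warning.
    Hypothetical stand-in for illustration only.
    """
    planned = []
    for flag, (name, requirement_met) in requirements.items():
        if not consume_flag(argv, flag):
            continue  # option not requested on the command line
        if requirement_met():
            planned.append(name)
        else:
            warnings.warn(f"Skip `{flag}`: requirement not met")
    return planned


if __name__ == "__main__":
    argv = ["setup.py", "install", "--peer_memory", "--nccl_p2p", "--fast_bottleneck"]
    # Dict order mirrors the order in which the flags would be handled.
    requirements = {
        "--peer_memory": ("peer_memory_cuda", lambda: True),
        "--nccl_p2p": ("nccl_p2p_cuda", lambda: True),
        "--fast_bottleneck": ("fast_bottleneck", lambda: True),
    }
    print(plan_extensions(argv, requirements))
    print(argv)  # custom flags have been removed from argv
```

Removing each flag from the argument list before the remaining arguments are parsed keeps options that only this script understands out of the rest of the command line.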
