---
orphan: true
---
- Documentation Development
Before building or serving the documentation, set up the docs environment using the Makefile:
make docs-env
source .venv-docs/bin/activate
`make docs-env` creates a virtual environment in `.venv-docs` and installs all required dependencies for building the documentation; the `source` command activates it.
To build the NeMo Curator documentation, run:
make docs-html
- The resulting HTML files are generated in a `_build/html` folder under the project `docs/` folder.
- The generated Python API docs are placed in `apidocs` under the `docs/` folder.
The documentation supports different build variants:
- `make docs-html` - Default build (includes all content)
- `make docs-html-ga` - GA (General Availability) build (excludes EA-only content)
- `make docs-html-ea` - EA (Early Access) build (includes all content)
To serve the documentation with live updates as you edit, run:
make docs-live
Open a web browser and go to http://localhost:8000 (or the port shown in your terminal) to view the output.
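The documentation system supports three ways to conditionally include or exclude content based on build tags (for example, GA vs EA builds): file-level frontmatter, grid cards, and toctree entries. All three share the same `only` condition syntax, written as `only:` in frontmatter and as `:only:` in directives.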
Use frontmatter to exclude entire documents from specific builds:
---
only: not ga
---
# This entire document will be excluded from GA builds
Supported conditions:
- `only: not ga` - Exclude from GA builds (EA-only content)
- `only: ga` - Include only in GA builds
- `only: not ea` - Exclude from EA builds
- `only: internal` - Include only in internal builds
Directory inheritance: If a parent directory's index.md has an only condition, all child documents inherit it automatically.
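
To make the condition rules and directory inheritance concrete, here is a minimal Python sketch. It is an illustration only, not the code used by the documentation build: the function names `is_included` and `effective_condition`, the `conditions` mapping, and the idea of modeling a build as a set of active tags are all assumptions made for this example.

```python
from pathlib import PurePosixPath


def is_included(condition, build_tags):
    """Return True if content with this `only` condition belongs in the build.

    `build_tags` is a set of active tags such as {"ga"} (an assumed model).
    """
    if not condition:
        return True  # no condition: always included
    words = condition.split()
    if len(words) == 2 and words[0] == "not":
        return words[1] not in build_tags
    if len(words) == 1:
        return words[0] in build_tags
    raise ValueError(f"unsupported only condition: {condition!r}")


def effective_condition(doc_path, conditions):
    """Return the document's own condition, else the nearest ancestor index.md's."""
    if doc_path in conditions:
        return conditions[doc_path]
    for parent in PurePosixPath(doc_path).parents:
        index = (parent / "index.md").as_posix()
        if index in conditions:
            return conditions[index]
    return None


# Hypothetical frontmatter conditions, keyed by path relative to docs/.
conditions = {"video/index.md": "not ga"}

inherited = effective_condition("video/tutorials/clip.md", conditions)
print(inherited)                       # not ga
print(is_included(inherited, {"ga"}))  # False: hidden in a GA build
print(is_included(inherited, {"ea"}))  # True: shown in an EA build
```

The sketch says nothing about how the default build sets its tags; it only shows how a single condition string could be checked once the active tags are known.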
Hide specific grid cards from certain builds:
:::{grid-item-card} Video Curation Features
:link: video-overview
:link-type: ref
:only: not ga
Content for EA-only features
+++
{bdg-secondary}`early-access`
:::
Control navigation entries conditionally:
# Global toctree condition (hides entire section)
::::{toctree}
:hidden:
:caption: Early Access Features
:only: not ga
ea-feature1.md
ea-feature2.md
::::
# Inline entry conditions (hides individual entries)
::::{toctree}
:hidden:
:caption: Documentation
standard-doc.md
ea-only-doc.md :only: not ga
another-standard-doc.md
::::
- Use file-level exclusion for entire documentation sections (better performance, no warnings)
- Use grid/toctree conditions for fine-grained control within shared documents
- Be consistent with condition syntax across all methods
- Test both build variants to ensure content appears/disappears correctly
# Test default build (includes all content)
make docs-html
# Test GA build (excludes EA-only content)
make docs-html-ga
# Verify content is properly hidden/shown in each build
Sphinx is configured to support running doctests in both Python docstrings and Markdown code blocks with the `{doctest}` directive. However, there are currently no real doctest examples in the codebase, only the example below in this README. If you add doctest examples, you can run them manually with:
source .venv-docs/bin/activate
cd docs
sphinx-build -b doctest . _build/doctest
There is currently no Makefile target for running doctests; you must use the above command directly.
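
If a Python entry point is more convenient than the shell command, the same invocation can be sketched as below. This is a hedged example, not part of the project: `doctest_build_args` and `run_doctests` are hypothetical helper names, and the sketch assumes it is run from the repository root with the docs environment activated. `sphinx.cmd.build.build_main` is the function behind the `sphinx-build` command.

```python
from pathlib import PurePosixPath


def doctest_build_args(docs_dir="docs", out_subdir="_build/doctest"):
    """Build the sphinx-build argument list for the doctest builder."""
    docs = PurePosixPath(docs_dir)
    return ["-b", "doctest", docs.as_posix(), (docs / out_subdir).as_posix()]


def run_doctests(docs_dir="docs"):
    """Run the doctest builder and return Sphinx's exit code (0 on success)."""
    # Imported lazily so doctest_build_args works without Sphinx installed.
    from sphinx.cmd.build import build_main

    return build_main(doctest_build_args(docs_dir))


print(doctest_build_args())
# ['-b', 'doctest', 'docs', 'docs/_build/doctest']
```

Calling `run_doctests()` from the repository root is intended to match `cd docs` followed by the `sphinx-build` command above, since both point Sphinx at the same source and output directories.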
Any triple-backtick code block that uses the `{doctest}` directive is tested when you run the doctest builder. The format follows Python's `doctest` module syntax: `>>>` marks Python input, and the following line shows the expected output. Here's an example:
def add(x: int, y: int) -> int:
    """
    Adds two integers together.
    Args:
        x (int): The first integer to add.
        y (int): The second integer to add.
    Returns:
        int: The sum of x and y.
    Examples:
    ```{doctest}
    >>> add(1, 2)
    3
    ```
    """
    return x + y
The documentation uses a custom Sphinx extension (`myst_codeblock_substitutions`) that enables MyST substitutions to work inside standard code blocks. This allows you to maintain consistent variables (like version numbers, URLs, product names) across your documentation while preserving template syntax in YAML and other template languages.
The extension is configured in docs/conf.py:
# Add the extension
extensions = [
    # ... other extensions
    "myst_codeblock_substitutions",  # Our custom MyST substitutions in code blocks
]
# Define reusable variables
myst_substitutions = {
    "product_name": "NeMo Curator",
    "product_name_short": "Curator",
    "company": "NVIDIA",
    "version": release,  # Uses the release variable from conf.py
    "current_year": "2025",
    "github_repo": "https://github.com/NVIDIA/NeMo-Curator",
    "docs_url": "https://docs.nvidia.com/nemo-curator",
    "support_email": "[email protected]",
    "min_python_version": "3.8",
    "recommended_cuda": "12.0+",
}
Use `{{ variable }}` syntax in regular Markdown text:
Welcome to {{ product_name }} version {{ version }}!
{{ product_name_short }} is developed by {{ company }}.
For support, contact {{ support_email }}.
The extension enables substitutions in standard code blocks:
# Install {{ product_name }} version {{ version }}
helm install my-release oci://nvcr.io/nvidia/nemo-curator --version {{ version }}
kubectl get pods -n {{ product_name_short }}-namespace
docker run --rm nvcr.io/nvidia/nemo-curator:{{ version }}
pip install nemo-curator=={{ version }}
The extension intelligently protects template languages from unwanted substitutions:
These languages are treated carefully to preserve their native {{ }} syntax:
- `yaml`, `yml` (Kubernetes, Docker Compose)
- `helm`, `gotmpl`, `go-template` (Helm charts)
- `jinja`, `jinja2`, `j2` (Ansible, Python templates)
- `ansible` (Ansible playbooks)
- `handlebars`, `hbs`, `mustache` (JavaScript templates)
- `django`, `twig`, `liquid`, `smarty` (Web framework templates)
The extension automatically detects and preserves common template patterns:
- `{{ .Values.something }}` (Helm values)
- `{{ ansible_variable }}` (Ansible variables)
- `{{ item.property }}` (Template loops)
- `{{- variable }}` (Whitespace control)
- `{{ range ... }}`, `{{ if ... }}` (Control structures)
# values.yaml - MyST substitutions work alongside Helm templates
image:
  repository: nvcr.io/nvidia/nemo-curator
  tag: {{ .Values.image.tag | default "latest" }}  # ← Helm template (preserved)
# Documentation URLs using MyST substitutions
downloads:
  releaseUrl: "https://github.com/NVIDIA/NeMo-Curator/releases/download/v{{ version }}/nemo-curator.tar.gz"  # ← MyST substitution
  docsUrl: "{{ docs_url }}"  # ← MyST substitution
  supportEmail: "{{ support_email }}"  # ← MyST substitution
service:
  name: {{ include "nemo-curator.fullname" . }}  # ← Helm template (preserved)
env:
  - name: CURATOR_VERSION
    value: "{{ .Chart.AppVersion }}"  # ← Helm template (preserved)
  - name: DOCS_VERSION
    value: "{{ version }}"  # ← MyST substitution
# MyST substitutions for documentation
- name: "Install {{ product_name }} version {{ version }}"  # ← MyST substitution
  shell: |
    wget {{ github_repo }}/releases/download/v{{ version }}/nemo-curator.tar.gz  # ← MyST substitution
  # Ansible templates preserved
  when: "{{ ansible_distribution }} == 'Ubuntu'"  # ← Ansible template (preserved)
  notify: "{{ handlers.restart_service }}"  # ← Ansible template (preserved)

- Single Source of Truth: Update version numbers, URLs, and product names in one place (`conf.py`)
- Template Safety: Won't break existing Helm charts, Ansible playbooks, or other templates
- Intelligent Processing: Only processes simple variable names, preserves complex template syntax
- Cross-Format Support: Works in bash, python, dockerfile, and other code blocks
- Maintainability: Reduces copy-paste errors and keeps documentation in sync with releases
The extension automatically handles the complexity of mixed template syntax, so you can focus on writing great documentation without worrying about breaking existing templates.
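
The "simple variable names only" behavior described above can be illustrated with a short Python sketch. This is not the extension's actual implementation, which is not shown in this README; `SIMPLE_VARIABLE` and `substitute` are hypothetical names, and the rule used here (replace only a bare identifier that is defined in the substitutions dictionary) is an assumption about one way to get the documented results.

```python
import re

# Matches only a bare identifier between the braces, e.g. {{ version }}.
# Expressions with dots, pipes, dashes, or arguments do not match.
SIMPLE_VARIABLE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def substitute(code, substitutions):
    """Replace known simple variables and leave every other {{ }} expression intact."""

    def replace(match):
        name = match.group(1)
        if name in substitutions:
            return str(substitutions[name])
        return match.group(0)  # unknown name: assume it belongs to the template

    return SIMPLE_VARIABLE.sub(replace, code)


# Example values for illustration only.
subs = {"version": "1.2.3", "docs_url": "https://docs.nvidia.com/nemo-curator"}

print(substitute('value: "{{ version }}"', subs))
# value: "1.2.3"
print(substitute('tag: {{ .Values.image.tag | default "latest" }}', subs))
# tag: {{ .Values.image.tag | default "latest" }}
print(substitute('when: "{{ ansible_distribution }}"', subs))
# when: "{{ ansible_distribution }}"
```

Under this rule, a template variable is only at risk if it is a bare name that collides with a key in `myst_substitutions`, which is one reason to keep those keys specific.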