docs: replace core-components with architecture page#1323

Open
cwing-nvidia wants to merge 3 commits into main from cwing/docs-core-components-refactor

@cwing-nvidia
Contributor

Summary

  • Replaces the "Environment Components" page (core-components.mdx) with a new "Architecture" page that bridges conceptual components with NeMo Gym's server implementation.
  • Clarifies composability and flexibility: bring your own agent, use Gym's harnesses, or train with any model endpoint.
  • Adds a task execution flow section ("How an Agent Runs a Task") with a diagram showing dataset → agent → model/resources server interactions.
  • Explains where tools live (agent-side vs environment-side) and how state isolation works (from stateless math verification to full Docker containers).
  • Updates terminology to be inclusive of external agents and models throughout.

Builds on the concept pages added in #1265.

Test plan

  • Preview docs locally with fern dev and verify the Architecture page renders correctly
  • Confirm nav auto-discovery picks up architecture.mdx and no longer shows core-components
  • Check all internal links resolve (Concepts, Browse Environments, Agents, Training, Build Custom Environments)
  • Review diagram renders correctly in the docs site

@copy-pr-bot

copy-pr-bot Bot commented May 14, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cwing-nvidia cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from d2251b3 to 472627d Compare May 14, 2026 03:20

Each task attempt flows through three steps. The resulting trajectory is called a *rollout*:

1. **Initialize** — The agent receives a task row from the dataset and initializes a session on the Resources Server, which sets up isolated state for this task.
Contributor

> initializes a session on the Resources Server

Not always, right? E.g., SWE RL.

Contributor Author

I think it should always be initialized by the resources server. I opened a separate issue to track decoupling for SWE here: #1249
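
For readers following this thread, here is a minimal sketch of the Initialize step as the page describes it: a task row is read from a JSONL dataset and a session with its own state is created for that task. It covers only the case where initialization goes through the Resources Server, which is the point under discussion above. `ResourcesServerStub`, `init_session`, `load_tasks`, and the `input` field are hypothetical names chosen for illustration, not NeMo Gym's API.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical per-task state held by a resources server."""
    session_id: str
    task: dict
    state: dict = field(default_factory=dict)


class ResourcesServerStub:
    """In-memory stand-in for a resources server (assumption, not the real API)."""

    def __init__(self) -> None:
        self.sessions: dict[str, Session] = {}

    def init_session(self, task: dict) -> str:
        # Each task attempt gets a fresh session ID and its own empty state.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = Session(session_id=session_id, task=task)
        return session_id


def load_tasks(jsonl_text: str) -> list[dict]:
    # One JSON object per non-empty line; each row is one task.
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


# Hypothetical dataset rows; the field names are illustrative only.
DATASET = '{"input": "What is 2 + 2?"}\n{"input": "Fix the failing test"}\n'

server = ResourcesServerStub()
tasks = load_tasks(DATASET)
session_ids = [server.init_session(task) for task in tasks]
```

In a real deployment the `init_session` call would presumably be an HTTP request from the agent to the server; the stub keeps everything in one process so the data flow is easy to follow.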


## Where Tools Live

Tools exist on a spectrum — some belong to the agent, some belong to the environment:
Contributor

Suggested change (current line, then proposed replacement):
Tools exist on a spectrum — some belong to the agent, some belong to the environment:
Tools can be implemented in an agent server, but it is useful to decouple them with a resources server for reuse and scalability:

Contributor Author

Users with existing agents will already have tools in their agent, and they should be able to use Gym without modifying their agent. Let me think on how to make this clearer in the wording

Contributor

I prefer the initial wording. Many users will already have tools in their agent.

Contributor Author

Kept the bullet wording and updated the intro sentence to provide additional clarity.

Tools exist on a spectrum — some belong to the agent, some belong to the environment:

- **Agent-side tools** are part of the agent harness. They're capabilities the agent brings regardless of which environment it runs in (e.g., OpenHands brings file editing and terminal tools).
- **Environment-side tools** are part of the Resources Server. They're capabilities the environment provides to any agent that connects (e.g., a `run_tests` endpoint, a database query tool, a sandbox execution API).
Contributor

This separation of agent and environment feels a bit confusing: we say the environment contains the agent, but here it sounds like it doesn't.

Contributor Author

agree - let me iterate on the wording

Contributor Author

renamed to environment-specific and agent-specific
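
To make the agent-specific versus environment-specific split in this thread concrete, the sketch below models both kinds of tools as plain Python callables and combines them into the set a single rollout could call. This is an illustration under assumptions: per the quoted page, environment-specific tools would be served by the Resources Server, whereas here they are local functions, and the `available_tools` helper and its collision check are hypothetical rather than NeMo Gym behavior.

```python
from typing import Callable

Tool = Callable[..., str]


# Agent-specific tool: a capability the harness brings to every environment.
def edit_file(path: str, content: str) -> str:
    return f"wrote {len(content)} chars to {path}"


# Environment-specific tool: a capability the environment offers to any agent.
def run_tests(suite: str) -> str:
    return f"ran test suite {suite}"


AGENT_TOOLS: dict[str, Tool] = {"edit_file": edit_file}
ENVIRONMENT_TOOLS: dict[str, Tool] = {"run_tests": run_tests}


def available_tools(
    agent_tools: dict[str, Tool], environment_tools: dict[str, Tool]
) -> dict[str, Tool]:
    # Refuse ambiguous names instead of silently shadowing one side
    # (an assumed policy for this sketch).
    overlap = agent_tools.keys() & environment_tools.keys()
    if overlap:
        raise ValueError(f"tool name collision: {sorted(overlap)}")
    return {**agent_tools, **environment_tools}


tools = available_tools(AGENT_TOOLS, ENVIRONMENT_TOOLS)
result = tools["run_tests"]("unit")
```

An existing agent keeps its own tools unchanged in this model; the environment only adds entries to the combined set, which matches the author's point that users should not have to modify their agent.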

|---------|---------------|
| Dataset | JSONL: Responses API input — each row is a task for the agent |
| Agent Harness | FastAPI Agent Server or your own agent via HTTP |
| Verifier + State | FastAPI Resources Server |

Do we also add environment tools here?

Contributor Author

Good question. I originally had it, but then felt that would be confusing because I don't have that as a core concept in the Environments concepts page. Let me think on this.

Contributor

I first was thinking to put it in, but now I lean towards leaving it out. This is because tools can be in either the agent harness or in the state (resources server). It's too confusing to introduce here and it's better down below expanded upon.

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated
| Dataset | JSONL: Responses API input — each row is a task for the agent |
| Agent Harness | FastAPI Agent Server or your own agent via HTTP |
| Verifier + State | FastAPI Resources Server |
| Model | FastAPI Model Server or managed by your own agent |
Contributor

Don't understand what "managed by your own agent" means.
We should be explicit (maybe) that you can use both hosted models and local models, perhaps? Unless it's obvious to user.

Contributor

Ah, I see. You probably mean if the model is TIED to the agent harness.
Claude Code, for example, will let you bring your own endpoint or it can be a default out of the box.

Never mind, I think this is good.

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated

### Agent Server

Hosts built-in agent harnesses, including NeMo Gym native harnesses and integrated open-source harnesses like OpenHands.
Contributor

Closed-source harnesses will also be supported. Claude Code, for example.

Contributor Author

I dropped 'open source'. Once we have Claude Code we could add it explicitly.

Manages the verifier, state, and environment-specific tools:

- **Environment Tools** — environment-specific capabilities available to any agent (e.g., code execution, database queries, API calls)
- **State** — isolated per rollout via session IDs. Some environments are stateless (e.g., math verification), while others provision full container environments (e.g., a Docker container with a specific repo checkout for SWE-Bench-style tasks).
Contributor

Stateless from task to task? Not sure if this is the right place to address it. Does stateless mean that each task has a fresh state, or something else?

Contributor

Wait, building on this, it sounds like yes: it's isolated per single task row in the JSONL.
I don't understand how provisioning a container environment is related to statelessness. You can still be stateless and have a container environment. What are we trying to say here? Kind of an open question.

Contributor Author

updated the language to clarify
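
The question in this thread is what "isolated per rollout via session IDs" means in practice. One reading, sketched below under assumptions, is that every session ID maps to its own state object, so concurrent rollouts never see each other's changes, while a verifier such as a math check needs no per-session state at all. `SessionStore`, `verify_math`, and the `rollout-a`/`rollout-b` IDs are hypothetical names for illustration and do not reflect how NeMo Gym stores state.

```python
class SessionStore:
    """Keeps one independent state dict per session ID (illustrative only)."""

    def __init__(self) -> None:
        self._state: dict[str, dict] = {}

    def get(self, session_id: str) -> dict:
        # Create the state lazily the first time a session is seen.
        return self._state.setdefault(session_id, {})

    def close(self, session_id: str) -> None:
        # Drop everything the rollout accumulated once it finishes.
        self._state.pop(session_id, None)


def verify_math(answer: str, expected: str) -> float:
    # Needs no per-session state: the reward depends only on its inputs.
    return 1.0 if answer.strip() == expected.strip() else 0.0


store = SessionStore()
store.get("rollout-a")["files"] = ["main.py"]
store.get("rollout-b")["files"] = []

# A change made in one rollout is invisible to the other.
store.get("rollout-a")["files"].append("test_main.py")
```

A container-backed environment would sit at the heavier end of the same idea: the session ID would point at a provisioned container rather than a dict, but the isolation boundary is still one session per rollout.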

Contributor

@hwolff99 hwolff99 left a comment

amazing work!

@cwing-nvidia cwing-nvidia marked this pull request as ready for review May 18, 2026 18:48
@cwing-nvidia cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch 2 times, most recently from 69f5161 to 888d2e1 Compare May 18, 2026 18:55
Contributor

@ananthsub ananthsub left a comment

these definitions make the architecture much clearer!

@cwing-nvidia cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from 5b8df16 to 98cad74 Compare May 18, 2026 21:28
Refactor the "Environment Components" page into an "Architecture" page
that bridges conceptual components (from the environments concept page)
with NeMo Gym's server-based implementation. Clarifies composability,
external agent/model support, tool ownership, and the task execution flow.

Signed-off-by: Chris Wing <[email protected]>
Address reviewer feedback: replace ambiguous "stateless" framing with
an isolation spectrum, and use consistent "environment-specific tools" /
"agent-specific tools" naming throughout.

Signed-off-by: Chris Wing <[email protected]>
@cwing-nvidia cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from 98cad74 to caf87ad Compare May 18, 2026 21:30
Co-authored-by: Ananth Subramaniam <[email protected]>
Signed-off-by: Chris Wing <[email protected]>
@cwing-nvidia cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from caf87ad to 6cdb6d8 Compare May 18, 2026 21:32
@cwing-nvidia cwing-nvidia requested a review from ananthsub May 18, 2026 21:33
5 participants