Codestin Search App

cwing-nvidia · 2026-05-14T03:16:02Z

Summary

Replaces the "Environment Components" page (core-components.mdx) with a new "Architecture" page that bridges conceptual components with NeMo Gym's server implementation.
Clarifies composability and flexibility: bring your own agent, use Gym's harnesses, or train with any model endpoint.
Adds a task execution flow section ("How an Agent Runs a Task") with a diagram showing dataset → agent → model/resources server interactions.
Explains where tools live (agent-side vs environment-side) and how state isolation works (from stateless math verification to full Docker containers).
Updates terminology to be inclusive of external agents and models throughout.

Builds on the concept pages added in #1265.

Test plan

Preview docs locally with fern dev and verify the Architecture page renders correctly
Confirm nav auto-discovery picks up architecture.mdx and no longer shows core-components
Check all internal links resolve (Concepts, Browse Environments, Agents, Training, Build Custom Environments)
Review diagram renders correctly in the docs site

copy-pr-bot · 2026-05-14T03:16:06Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

cmunley1 · 2026-05-14T03:33:34Z

+
+Each task attempt flows through three steps. The resulting trajectory is called a *rollout*:
+
+1. **Initialize** — The agent receives a task row from the dataset and initializes a session on the Resources Server, which sets up isolated state for this task.


initializes a session on the Resources Server

Not always, right? eg SWE RL

I think it should always be initialized by the resources server. i opened a separate issue to track decoupling for SWE here #1249

cmunley1 · 2026-05-14T03:34:54Z

+
+## Where Tools Live
+
+Tools exist on a spectrum — some belong to the agent, some belong to the environment:


Suggested change

Tools exist on a spectrum — some belong to the agent, some belong to the environment:

Tools can be implemented in an agent server, but it is useful to decouple them with a resources server for reuse and scalability:

Users with existing agents will already have tools in their agent, and they should be able to use Gym without modifying their agent. Let me think on how to make this clearer in the wording

I prefer the initial wording. Many users will already have tools in their agent.

kept the bullet wording, updated the intro sentence to provide additional clarity

cmunley1 · 2026-05-14T03:35:29Z

+Tools exist on a spectrum — some belong to the agent, some belong to the environment:
+
+- **Agent-side tools** are part of the agent harness. They're capabilities the agent brings regardless of which environment it runs in (e.g., OpenHands brings file editing and terminal tools).
+- **Environment-side tools** are part of the Resources Server. They're capabilities the environment provides to any agent that connects (e.g., a `run_tests` endpoint, a database query tool, a sandbox execution API).


this separation of agent and environment feels a bit confusing given that we say environment contains agent, then here it sounds like it doesnt

agree - let me iterate on the wording

renamed to environment-specific and agent-specific

AmeliaYe · 2026-05-14T16:17:17Z

+|---------|---------------|
+| Dataset | JSONL: Responses API input — each row is a task for the agent |
+| Agent Harness | FastAPI Agent Server or your own agent via HTTP |
+| Verifier + State | FastAPI Resources Server |


Do we also add environment tools here?

good question, i originally had it, but then felt that would be confusing because i dont have that as a core concept in the Environments concepts page. let me think on this

I first was thinking to put it in, but now I lean towards leaving it out. This is because tools can be in either the agent harness or in the state (resources server). It's too confusing to introduce here and it's better down below expanded upon.

hwolff99 · 2026-05-15T21:03:03Z

+| Dataset | JSONL: Responses API input — each row is a task for the agent |
+| Agent Harness | FastAPI Agent Server or your own agent via HTTP |
+| Verifier + State | FastAPI Resources Server |
+| Model | FastAPI Model Server or managed by your own agent |


Don't understand what "managed by your own agent" means.
We should be explicit (maybe) that you can use both hosted models and local models, perhaps? Unless it's obvious to user.

Ah, I see. You probably mean if the model is TIED to the agent harness.
Claude code, for example, will let you bring your own endpoint or it can be a default out of the box.

Nevermind, I think this is good.

hwolff99 · 2026-05-15T21:19:55Z

+
+### Agent Server
+
+Hosts built-in agent harnesses, including NeMo Gym native harnesses and integrated open-source harnesses like OpenHands.


Closed source harnesses will also be supported. Claude code, for example.

I dropped 'open source'. once we have claude code we could add it explicitly

hwolff99 · 2026-05-15T21:21:36Z

+Manages the verifier, state, and environment-specific tools:
+
+- **Environment Tools** — environment-specific capabilities available to any agent (e.g., code execution, database queries, API calls)
+- **State** — isolated per rollout via session IDs. Some environments are stateless (e.g., math verification), while others provision full container environments (e.g., a Docker container with a specific repo checkout for SWE-Bench-style tasks).


stateless from task to task? Not sure if right place to address. Like does stateless mean that each task has a fresh state or something else?

wait building on this it sounds like yes, it's isolated per single task row in the jsonl.
I don't understand how provisioning a container environment is related to statelessness. You can still be stateless and have a container environment. What are we trying to say here? Kinda open q

updated the language to clarify

hwolff99

amazing work!

ananthsub

these definitions make the architecture much clearer!

Refactor the "Environment Components" page into an "Architecture" page that bridges conceptual components (from the environments concept page) with NeMo Gym's server-based implementation. Clarifies composability, external agent/model support, tool ownership, and the task execution flow. Signed-off-by: Chris Wing <[email protected]>

Address reviewer feedback: replace ambiguous "stateless" framing with an isolation spectrum, and use consistent "environment-specific tools" / "agent-specific tools" naming throughout. Signed-off-by: Chris Wing <[email protected]>

Co-authored-by: Ananth Subramaniam <[email protected]> Signed-off-by: Chris Wing <[email protected]>

cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from d2251b3 to 472627d Compare May 14, 2026 03:20

cmunley1 reviewed May 14, 2026

View reviewed changes

AmeliaYe reviewed May 14, 2026

View reviewed changes

hwolff99 reviewed May 15, 2026

View reviewed changes

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated

hwolff99 reviewed May 15, 2026

View reviewed changes

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated

hwolff99 reviewed May 15, 2026

View reviewed changes

cwing-nvidia marked this pull request as ready for review May 18, 2026 18:48

cwing-nvidia requested review from AmeliaYe, ananthsub, cmunley1 and hwolff99 May 18, 2026 18:49

copy-pr-bot Bot temporarily deployed to public May 18, 2026 18:49 Inactive

copy-pr-bot Bot temporarily deployed to public May 18, 2026 18:50 Inactive

cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch 2 times, most recently from 69f5161 to 888d2e1 Compare May 18, 2026 18:55

ananthsub reviewed May 18, 2026

View reviewed changes

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated

Comment thread fern/versions/latest/pages/about/architecture.mdx Outdated

cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from 5b8df16 to 98cad74 Compare May 18, 2026 21:28

cwing-nvidia added 2 commits May 18, 2026 14:29

cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from 98cad74 to caf87ad Compare May 18, 2026 21:30

Update fern/versions/latest/pages/about/architecture.mdx

6cdb6d8

Co-authored-by: Ananth Subramaniam <[email protected]> Signed-off-by: Chris Wing <[email protected]>

cwing-nvidia force-pushed the cwing/docs-core-components-refactor branch from caf87ad to 6cdb6d8 Compare May 18, 2026 21:32

cwing-nvidia requested a review from ananthsub May 18, 2026 21:33

ananthsub approved these changes May 18, 2026

View reviewed changes


		Each task attempt flows through three steps. The resulting trajectory is called a rollout:

		1. Initialize — The agent receives a task row from the dataset and initializes a session on the Resources Server, which sets up isolated state for this task.


		## Where Tools Live

		Tools exist on a spectrum — some belong to the agent, some belong to the environment:

	Tools exist on a spectrum — some belong to the agent, some belong to the environment:
	Tools can be implemented in an agent server, but it is useful to decouple them with a resources server for reuse and scalability:


		### Agent Server

		Hosts built-in agent harnesses, including NeMo Gym native harnesses and integrated open-source harnesses like OpenHands.

Conversation

cwing-nvidia commented May 14, 2026

Summary

Test plan

Uh oh!

copy-pr-bot Bot commented May 14, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hwolff99 left a comment

Choose a reason for hiding this comment

Uh oh!

ananthsub left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants