feat(agent/agentcontainers): retry with longer name on failure #18513

Merged
DanielleMaywood merged 15 commits into main from dm-devcontainer-retry on Jun 24, 2025

Contributor

@DanielleMaywood DanielleMaywood commented Jun 23, 2025

Closes coder/internal#732

We now make up to 5 attempts, using a progressively longer name each time, when creating an agent named after the workspace folder.

It is important to note that this flow only ever runs when attempting to create an agent using the workspace folder as the name. If a deployment uses Terraform or the devcontainer customization, we do not fall back to this approach.

[Screenshot: 2025-06-24 at 09:48:08]

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds a retry mechanism for sub-agent creation when using the workspace-folder-derived name fails due to unique constraint violations, and covers the new behavior with tests.

  • Implement retry loop (up to 5 attempts) with increasingly expanded names on unique‐constraint errors.
  • Track when the workspace-folder name is used (usingWorkspaceFolderName) and fall back to expandedAgentName.
  • Extend the fake client and add tests (TestSubAgentCreationWithNameRetry and TestExpandedAgentName) to validate collision handling.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| agent/agentcontainers/api.go | Add usingWorkspaceFolderName, retry loop with expandedAgentName, and PQ error handling |
| agent/agentcontainers/api_test.go | Enhance fakeSubAgentClient for conflict simulation and add retry tests |
| agent/agentcontainers/api_internal_test.go | Add tests for expandedAgentName logic at various depths and edge cases |
Comments suppressed due to low confidence (1)

agent/agentcontainers/api.go:22

  • The retry code uses errors.As, but the standard "errors" package is not imported, causing a compile error. Add import "errors".
	"github.com/lib/pq"

@DanielleMaywood DanielleMaywood marked this pull request as ready for review June 24, 2025 09:04
Member

@mafredri mafredri left a comment

I suggested some changes to state handling and finalizing the name if creation succeeds (assuming it's not the container's name).

@@ -591,6 +593,7 @@ func (api *API) processUpdatedContainersLocked(ctx context.Context, updated code
// agent name based off of the folder name (i.e. no valid characters),
// we will instead fall back to using the container's friendly name.
dc.Name = safeAgentName(path.Base(filepath.ToSlash(dc.WorkspaceFolder)), dc.Container.FriendlyName)
api.usingWorkspaceFolderName[dc.WorkspaceFolder] = struct{}{}
Member

Should this check if dc.Name != dc.Container.FriendlyName? Also, it would be good to ensure there's no conflict with api.devcontainerNames so that we don't accidentally create this agent before the predetermined one and then trigger a reverse conflict.

Contributor Author

Oh yep good point. Will do 👍

recreateSuccessTimes map[string]time.Time // By workspace folder.
recreateErrorTimes map[string]time.Time // By workspace folder.
injectedSubAgentProcs map[string]subAgentProcess // By workspace folder.
usingWorkspaceFolderName map[string]struct{} // By workspace folder.
Member

Suggestion: While struct{} works and takes up less space, using a boolean would be a bit cleaner IMO, but up to you.

Contributor Author

Happy to make that change 👍
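For readers weighing the suggestion above, here is a small comparison of the two map shapes. The type names `folderSet` and `folderFlags` are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// folderSet and folderFlags are hypothetical names for the two shapes
// discussed above.
type folderSet map[string]struct{}
type folderFlags map[string]bool

// has reports membership. The empty-struct form needs the two-value lookup.
func (s folderSet) has(folder string) bool {
	_, ok := s[folder]
	return ok
}

func main() {
	set := folderSet{"/home/coder/project": {}}
	flags := folderFlags{"/home/coder/project": true}

	fmt.Println(set.has("/home/coder/project"), set.has("/tmp")) // true false
	// With bool values a missing key reads as false, so no helper is needed.
	fmt.Println(flags["/home/coder/project"], flags["/tmp"]) // true false
}
```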

// We increase how much of the workspace folder is used for generating
// the agent name. With each iteration there is greater chance of this
// being successful.
subAgentConfig.Name = expandedAgentName(dc.WorkspaceFolder, dc.Container.FriendlyName, attempt)
Member

Assuming this isn't "container friendly name", and the creation succeeds on next iteration, we should write it to api.devcontainerNames and avoid re-evaluating it again so it no longer changes.

Contributor Author

Sounds good to me

//
// We only care if sub agent creation has failed due to a unique constraint
// violation on the agent name, as we can _possibly_ rectify this.
if !strings.Contains(err.Error(), "workspace agent name") {
Member

errors.As is a bit tricky but typically works if you have a pointer to the type like:

myErr := &someError{}
// or maybe: var myErr *someError
if errors.As(err, &myErr) {
  // ...
}

It's very possible I got something wrong there as well.

Contributor Author

That is what I was trying. I'll try again because I'm not happy with the solution I ended up with.

I'm not entirely sure why it didn't work but it could possibly be because it is wrapped in xerrors.Errorf and then transported over the wire via drpc. Is it possible that in the serialization and deserialization process some of that information was lost?

if err != nil {
	return nil, xerrors.Errorf("insert sub agent: %w", err)
}


for attempt := 1; attempt <= maxAttempts; attempt++ {
	if proc.agent, err = client.Create(ctx, subAgentConfig); err == nil {
		break
Member

Perhaps we can delete from usingWorkspaceFolderName here?

originalName := subAgentConfig.Name
maxAttempts := 5

for attempt := 1; attempt <= maxAttempts; attempt++ {
Member

Do we know all sub-agent names for the workspace at this point? If so, could we not do this check in-memory before trying the creation?

Contributor Author

Sort of. We know all sub-agent names for this agent. There could be another parent agent that creates dev containers (not really sure why you would but it is possible).

I'm happy to update the logic slightly to first create a known unique name for this parent agent, and then fall back to adding more context again?

Member

@johnstcn johnstcn Jun 24, 2025

Most workspaces are going to only have a single top-level agent. It makes sense to avoid this round trip if we can.

Contributor Author

That's fair. I'll try to update the logic to do the best it can beforehand (and keep this fallback).

This logic is already a fallback from the assumed default of only 1 devcontainer per workspace (and spun up with devcontainer up instead of being defined in Terraform), so I think it probably isn't going to be a hot path anyway.
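The in-memory pre-check discussed in this thread might look something like the sketch below. It assumes the parent agent keeps a set of sub-agent names it already knows about; `pickUnusedName`, `used`, and `nameFor` are hypothetical names, not identifiers from this PR. Names held by a different parent agent would not appear in the set, which is why the server-side retry is kept as a fallback.

```go
package main

import "fmt"

// pickUnusedName returns the first candidate from nameFor, from depth 0 up
// to maxDepth, that is not already in used. The bool is false if every
// candidate is taken.
func pickUnusedName(used map[string]bool, maxDepth int, nameFor func(depth int) string) (string, bool) {
	for depth := 0; depth <= maxDepth; depth++ {
		if name := nameFor(depth); !used[name] {
			return name, true
		}
	}
	return "", false
}

func main() {
	// Names this parent agent already knows about (hypothetical data).
	used := map[string]bool{"project": true}
	candidates := []string{"project", "coder-project", "home-coder-project"}

	name, ok := pickUnusedName(used, len(candidates)-1, func(depth int) string {
		return candidates[depth]
	})
	fmt.Println(name, ok) // coder-project true
}
```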

Comment on lines +339 to +345
{
	name:            "path with multiple leading slashes",
	workspaceFolder: "///home/coder/project",
	friendlyName:    "friendly-fallback",
	depth:           1,
	expected:        "coder-project",
},
Member

For funsies, how about a Windows-style path?

C:\Documents and Settings\My Username\Documents\Code\Some Project Version 3\

We can skip if you don't think that's valuable.
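One thing a Windows-style case would exercise: `filepath.ToSlash` only replaces the host OS path separator, so on a non-Windows host it leaves backslashes untouched. The sketch below splits on both separators explicitly. `folderParts` and `lastParts` are hypothetical helpers, not the PR's `expandedAgentName`, and no name sanitisation (lowercasing, replacing spaces) is attempted here.

```go
package main

import (
	"fmt"
	"strings"
)

// folderParts splits a workspace folder into its non-empty components,
// treating both '/' and '\' as separators regardless of the host OS.
func folderParts(folder string) []string {
	return strings.FieldsFunc(folder, func(r rune) bool {
		return r == '/' || r == '\\'
	})
}

// lastParts joins the last depth+1 components with hyphens. It returns an
// empty string when the folder has no components.
func lastParts(folder string, depth int) string {
	parts := folderParts(folder)
	if len(parts) == 0 {
		return ""
	}
	start := len(parts) - 1 - depth
	if start < 0 {
		start = 0
	}
	return strings.Join(parts[start:], "-")
}

func main() {
	fmt.Println(lastParts("///home/coder/project", 1)) // coder-project

	windows := `C:\Documents and Settings\My Username\Documents\Code\Some Project Version 3\`
	fmt.Println(lastParts(windows, 1)) // Code-Some Project Version 3
}
```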

Member

@mafredri mafredri left a comment

Looks great, nice work!

@DanielleMaywood DanielleMaywood merged commit fcf9371 into main Jun 24, 2025
34 checks passed
@DanielleMaywood DanielleMaywood deleted the dm-devcontainer-retry branch June 24, 2025 18:04
@github-actions github-actions bot locked and limited conversation to collaborators Jun 24, 2025
Development

Successfully merging this pull request may close these issues.

Devcontainers: Use stable naming for devcontainers
3 participants