
Cache ACR tokens in deployment state to avoid redundant authentication #12792

Closed
Copilot wants to merge 6 commits into main from copilot/cache-access-token-deployment-state

Contributor

Copilot AI commented Nov 7, 2025

Description

ACR login performs an OAuth2 token exchange on every deployment, causing unnecessary latency and potential rate limiting. This PR caches refresh tokens in deployment state and tracks their expiration.

Implementation:

  • Section per registry: AcrTokens:{registry} sections store tokens keyed by tenantId
  • Single I/O path: Acquire section once, save only when obtaining fresh token with expiration
  • Automatic recovery: 401/403 responses from cached tokens trigger fresh token acquisition
  • Expiration safety: 5-minute buffer accounts for clock skew
  • Conservative caching: Tokens are only cached when ACR provides an expiration time (no default TTL)
  • Accurate timestamps: Expiration calculated immediately after receiving token from server to minimize time drift
  • Testable time: Uses TimeProvider for time operations, enabling deterministic testing with FakeTimeProvider
  • Efficient serialization: CachedToken.ToJsonNode() manually constructs JSON for optimal performance

Example state structure:

{
  "AcrTokens:myregistry_azurecr_io": {
    "tenant-1": {
      "refresh_token": "...",
      "expires_at_utc": "2025-11-07T10:30:00Z"
    }
  }
}
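
The expiration check with a 5-minute buffer could look roughly like the sketch below. `IsCachedTokenUsable` is a hypothetical helper written for illustration, not the code in this PR; the expiry matches the example state above.

```csharp
using System;

// Hypothetical helper (an assumption, not the PR's code): a cached token is
// reused only while "now" is still earlier than its expiry minus the buffer.
static bool IsCachedTokenUsable(DateTimeOffset expiresAtUtc, DateTimeOffset nowUtc, TimeSpan buffer)
    => nowUtc < expiresAtUtc - buffer;

var buffer = TimeSpan.FromMinutes(5);
var expiresAtUtc = new DateTimeOffset(2025, 11, 7, 10, 30, 0, TimeSpan.Zero);

// Ten minutes before expiry: outside the buffer, so the token can be reused.
Console.WriteLine(IsCachedTokenUsable(expiresAtUtc, expiresAtUtc.AddMinutes(-10), buffer)); // True

// Four minutes before expiry: inside the buffer, so a fresh login is needed.
Console.WriteLine(IsCachedTokenUsable(expiresAtUtc, expiresAtUtc.AddMinutes(-4), buffer));  // False
```

In the service itself the current time would presumably come from the injected TimeProvider rather than being passed in directly.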

Testing improvements:

  • All time-based tests use FakeTimeProvider.Advance() instead of Task.Delay
  • No more multi-minute test delays - all 8 tests run instantly
  • Fully deterministic time-based test coverage
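
As a rough illustration of why advancing a fake clock removes the delays, here is a self-contained sketch. `ManualTimeProvider` is a hypothetical stand-in for FakeTimeProvider (which ships in a separate testing package), and the 3-hour lifetime is an assumed value.

```csharp
using System;

// The clock is advanced by hand, so "waiting" four hours takes no real time.
var clock = new ManualTimeProvider(new DateTimeOffset(2025, 11, 7, 7, 30, 0, TimeSpan.Zero));
var expiresAtUtc = clock.GetUtcNow().AddHours(3); // assumed token lifetime

bool validBefore = clock.GetUtcNow() < expiresAtUtc;
clock.Advance(TimeSpan.FromHours(4));
bool validAfter = clock.GetUtcNow() < expiresAtUtc;

Console.WriteLine($"{validBefore} {validAfter}"); // True False

// Hypothetical minimal fake: overrides only the clock reading.
sealed class ManualTimeProvider : TimeProvider
{
    private DateTimeOffset _utcNow;

    public ManualTimeProvider(DateTimeOffset start) => _utcNow = start;

    public override DateTimeOffset GetUtcNow() => _utcNow;

    public void Advance(TimeSpan delta) => _utcNow += delta;
}
```

Code under test only needs to accept a TimeProvider and call GetUtcNow() on it; production code can pass TimeProvider.System.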

Fixes #(issue)

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?
Original prompt

AcrLoginService should cache the access token in deployment state using IDeploymentStateManager along with the ExpiresIn. We shouldn't need to login as long as the token is valid.



Copilot AI and others added 2 commits November 7, 2025 07:50
- Add IDeploymentStateManager dependency to AcrLoginService
- Cache refresh tokens with expiration timestamps in deployment state
- Check for valid cached tokens before performing fresh login
- Store tokens in "AcrTokens" section keyed by registry+tenant
- Add 5-minute safety margin to expiration checks
- Default to 3-hour expiration if not provided by ACR
- Gracefully handle caching failures without breaking login
- Add comprehensive test coverage for caching behavior

Co-authored-by: davidfowl <[email protected]>

- Acquire deployment state section once per login operation
- Use registry URL as section name (not combined with tenant)
- Store tokens keyed by tenantId within each registry section
- Only save section when acquiring fresh token (not when using cached)
- Handle 401/403 errors from cached tokens and retry with fresh token
- Add test for 401 retry behavior
- Update all existing tests to match new section naming scheme

Co-authored-by: davidfowl <[email protected]>
Copilot AI changed the title from "[WIP] Add caching of access token in deployment state" to "Cache ACR tokens in deployment state to avoid redundant authentication" Nov 7, 2025
Copilot AI requested a review from davidfowl November 7, 2025 08:11
@davidfowl
Member

@copilot Use TimeProvider so we can test the time without sleeping. Use FakeTimeProvider to test it.

- Inject TimeProvider instead of using DateTime.UtcNow directly
- Update all tests to use FakeTimeProvider for time-based testing
- Add ToJsonNode method on CachedToken that manually constructs JSON
- Remove skipped test - now uses FakeTimeProvider.Advance() instead of Task.Delay
- All 8 tests now passing without requiring long delays

Co-authored-by: davidfowl <[email protected]>
Contributor Author

Copilot AI commented Nov 7, 2025

@copilot Use TimeProvider so we can test the time without sleeping. Use FakeTimeProvider to test it.

Done. Updated to use TimeProvider for all time operations and FakeTimeProvider in tests. Also added ToJsonNode() method on CachedToken that manually constructs the JSON node. All 8 tests now passing without delays (commit 0b22b0b).

@davidfowl davidfowl marked this pull request as ready for review November 8, 2025 23:34
Copilot AI review requested due to automatic review settings November 8, 2025 23:34
@github-actions
Contributor

github-actions bot commented Nov 8, 2025

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 12792

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 12792"

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds token caching functionality to the AcrLoginService to improve performance and reduce unnecessary authentication requests when logging into Azure Container Registry. The caching mechanism uses the IDeploymentStateManager to persist tokens with expiration tracking, and includes retry logic for rejected cached tokens.

Key changes:

  • Added token caching using IDeploymentStateManager to store ACR refresh tokens with expiration times
  • Implemented expiration-aware token reuse with a 5-minute safety margin for clock skew
  • Added retry logic to fetch fresh tokens when cached tokens are rejected with 401/403 errors

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/Aspire.Hosting.Azure/AcrLoginService.cs | Added token caching infrastructure with IDeploymentStateManager and TimeProvider dependencies; implemented cache lookup, validation, and storage logic with retry on authentication failures |
| tests/Aspire.Hosting.Azure.Tests/AcrLoginServiceTests.cs | Comprehensive test suite covering token caching scenarios including cache hits/misses, expiration, multi-tenant/multi-registry support, retry on 401/403, and graceful degradation on cache failures |

Comment on lines 57 to 61
return new System.Text.Json.Nodes.JsonObject
{
    ["refresh_token"] = RefreshToken,
    ["expires_at_utc"] = ExpiresAtUtc
};

Copilot AI Nov 8, 2025

The ToJsonNode() method uses System.Text.Json.Nodes.JsonObject with a fully qualified name, which is inconsistent with the style elsewhere in the file. Consider adding using System.Text.Json.Nodes; at the top of the file and using just JsonObject for consistency with the rest of the codebase.

Contributor Author

Fixed. Added using System.Text.Json.Nodes; and using JsonObject directly (commit d313451).
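
For readers unfamiliar with building a node by hand, a minimal sketch follows. The property names come from the example state structure in the description; the token value, the DateTimeOffset type, and the string round-trip format are assumptions for illustration, not necessarily what CachedToken does.

```csharp
using System;
using System.Globalization;
using System.Text.Json.Nodes;

// Assumed values for illustration only.
var expiresAtUtc = new DateTimeOffset(2025, 11, 7, 10, 30, 0, TimeSpan.Zero);

// Build the node manually; the timestamp is stored as a culture-invariant
// round-trip ("O") string.
var node = new JsonObject
{
    ["refresh_token"] = "example-token",
    ["expires_at_utc"] = expiresAtUtc.ToString("O", CultureInfo.InvariantCulture)
};

// Read the values back, as a cache lookup might.
var token = node["refresh_token"]!.GetValue<string>();
var parsed = DateTimeOffset.Parse(
    node["expires_at_utc"]!.GetValue<string>(),
    CultureInfo.InvariantCulture);

Console.WriteLine(token);                  // example-token
Console.WriteLine(parsed == expiresAtUtc); // True
```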

Comment on lines +132 to +133
catch (InvalidOperationException ex) when (ex.Message.Contains("401") || ex.Message.Contains("403") ||
                                           ex.Message.Contains("Unauthorized") || ex.Message.Contains("Forbidden"))

Copilot AI Nov 8, 2025

Catching exceptions and checking their message strings is fragile and error-prone. Consider documenting which container runtimes throw InvalidOperationException with these specific messages, or refactor the container runtime interface to throw more specific exception types that can be caught directly.
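
One way the typed-exception alternative might look is sketched below. It assumes the login failure can be surfaced as an exception carrying a status code; HttpRequestException is used purely as a stand-in, and `Login` and `LoginWithFallback` are hypothetical names, not part of the container runtime interface.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical login step: rejects a stale token with a typed failure.
static void Login(string refreshToken)
{
    if (refreshToken == "stale")
    {
        throw new HttpRequestException("login rejected", null, HttpStatusCode.Unauthorized);
    }
}

// Try the cached token first; fall back to a fresh token only on 401/403.
static string LoginWithFallback(string cachedToken, Func<string> acquireFreshToken)
{
    try
    {
        Login(cachedToken);
        return cachedToken;
    }
    catch (HttpRequestException ex)
        when (ex.StatusCode is HttpStatusCode.Unauthorized or HttpStatusCode.Forbidden)
    {
        var fresh = acquireFreshToken();
        Login(fresh);
        return fresh;
    }
}

Console.WriteLine(LoginWithFallback("stale", () => "fresh")); // fresh
Console.WriteLine(LoginWithFallback("good", () => "fresh"));  // good
```

The filter matches on a status code rather than on message text, so a change in the runtime's error wording would not silently disable the retry.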

Comment on lines 285 to 286
public static Task<bool> InspectImageAsync(string _, CancellationToken __ = default) => Task.FromResult(true);
public static Task PullImageAsync(string _, CancellationToken __ = default) => Task.CompletedTask;

Copilot AI Nov 8, 2025

These methods are declared as static but they're part of a class implementing IContainerRuntime interface. If the interface declares these as instance methods, this will cause a compilation error. These should be instance methods, not static.

Suggested change
- public static Task<bool> InspectImageAsync(string _, CancellationToken __ = default) => Task.FromResult(true);
- public static Task PullImageAsync(string _, CancellationToken __ = default) => Task.CompletedTask;
+ public Task<bool> InspectImageAsync(string _, CancellationToken __ = default) => Task.FromResult(true);
+ public Task PullImageAsync(string _, CancellationToken __ = default) => Task.CompletedTask;

Contributor Author

Fixed. Removed static modifier from both methods (commit d313451).

Comment on lines 203 to 205
// Use the registry endpoint as the section name
// Replace dots and other characters that might not be suitable for section names
return $"AcrTokens:{registryEndpoint.Replace('.', '_')}";

Copilot AI Nov 8, 2025

The comment mentions 'dots and other characters' but the code only replaces dots. Either update the comment to accurately reflect what's being replaced (just dots), or add handling for other potentially problematic characters if they exist.

Contributor Author

Updated comment to say "Replace dots with underscores" instead of "dots and other characters" (commit d313451).

Comment on lines 257 to 260
// Default to 3 hours (10800 seconds) if not provided by ACR
var expiresIn = tokenResponse.ExpiresIn ?? 10800;

return (tokenResponse.RefreshToken, expiresIn);

Member

@copilot - This is a bad assumption. Change the code to return null for the expiresIn in this case, and handle it in the callers by not caching it.

Contributor Author

Changed to return null for expiresIn when not provided by ACR. Tokens without expiration are now not cached at all. Updated test to verify this behavior (commit d313451).
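
A minimal sketch of that behavior, under assumptions: `ToExpiresAtUtc` is a hypothetical helper, and the timestamp passed in stands for the moment the token response was received. A null result means the token is used once and not cached.

```csharp
using System;

// Hypothetical helper (an assumption, not the PR's code): turns the server's
// relative lifetime into an absolute expiry, or null when none was provided.
static DateTimeOffset? ToExpiresAtUtc(int? expiresInSeconds, DateTimeOffset receivedAtUtc)
    => expiresInSeconds is int seconds
        ? receivedAtUtc.AddSeconds(seconds)
        : (DateTimeOffset?)null;

// Captured right after the response arrives, before any other work.
var receivedAtUtc = new DateTimeOffset(2025, 11, 7, 7, 30, 0, TimeSpan.Zero);

var withLifetime = ToExpiresAtUtc(10800, receivedAtUtc);
var withoutLifetime = ToExpiresAtUtc(null, receivedAtUtc);

Console.WriteLine(withLifetime.HasValue);    // True: cache with this expiry
Console.WriteLine(withoutLifetime.HasValue); // False: skip caching
```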

var newCachedToken = new CachedToken
{
    RefreshToken = refreshToken,
    ExpiresAtUtc = _timeProvider.GetUtcNow().AddSeconds(expiresIn).UtcDateTime

Member

@copilot - we should do this calculation as soon as possible after getting the expiresIn value from the server. We shouldn't be doing operations in between.

Contributor Author

Moved the expiration calculation to immediately after receiving the expiresIn value from the server, before any logging or section acquisition operations (commit 583d794).

- Add using System.Text.Json.Nodes and simplify JsonObject usage
- Fix static modifier on interface methods in test class
- Update comment to accurately reflect only dots are replaced
- Change expiresIn handling to return null instead of defaulting to 3 hours
- Skip caching when ACR doesn't provide expiration time
- Update test to verify tokens aren't cached without expiration
- Add CultureInfo.InvariantCulture to ToString() call

Co-authored-by: eerhardt <[email protected]>

Move the ExpiresAtUtc calculation to immediately after getting the expiresIn value from the server, before any logging or other operations. This minimizes time drift between when the server provides the expiration time and when we calculate the actual timestamp.

Co-authored-by: eerhardt <[email protected]>
@dotnet-policy-service dotnet-policy-service bot added this to the 13.2 milestone Feb 4, 2026
3 participants