feat: drilldown config endpoint #19028

trevorwhitney · 2025-08-26T22:55:15Z

What this PR does / why we need it:

This PR adds a new endpoint at /loki/api/v1/config that exposes a mix of tenant limits and config that Drilldown can use to make decisions that improve the user experience.

This extends the work done in #18521, which was originally done for this purpose (of supporting Drilldown), but in practice has two major drawbacks.

First, the tenant limits endpoint is at /config/tenant/v1/limits, which is a path outside of what the current Loki datasource can access, so for Drilldown to use it, core Grafana would need to change.

Second, that endpoint only returns limits, and there is also some config Drilldown needs to know about (such as if the pattern ingesters are enabled). It could be possible to move this to a per-tenant config, but this PR is a bit more future proof in case there are configs added in the future that are relevant to Drilldown.

Special notes for your reviewer:

Checklist

Reviewed the CONTRIBUTING.md guide (required)
Documentation added
Tests updated
Title matches the required conventional commits format, see here
- Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

claude · 2025-08-29T17:07:52Z

Claude finished @trevorwhitney's task

PR Review Complete ✅

I've completed a comprehensive review of PR #19028: feat: drilldown config endpoint

📋 Review Summary

This PR successfully adds a new /loki/api/v1/config endpoint that provides a JSON response containing filtered tenant limits and pattern ingester configuration for Drilldown. The implementation is solid with good test coverage.

✅ Positive Aspects

Code Quality & Architecture

Clean separation of concerns with dedicated DrilldownConfigResponse struct
Consistent error handling patterns throughout
Good code organization and readability
Proper use of existing Loki patterns and interfaces

Security & Authorization

✅ Proper tenant ID extraction using tenant.ExtractTenantIDFromHTTPRequest
✅ Respects the existing TenantLimitsAllowPublish allowlist for field filtering
✅ Graceful degradation when tenant limits are unavailable
✅ Consistent authorization model with existing endpoints

Test Coverage

✅ Comprehensive test suite covering all major scenarios
✅ Tests for fallback behavior (TenantLimits → Overrides → defaults)
✅ Tests for allowlist filtering functionality
✅ Tests for pattern ingester enabled/disabled states
✅ Edge cases covered (missing tenants, empty configs)
✅ Proper use of table-driven tests

🔍 Detailed Technical Analysis

Implementation Quality

pkg/loki/drilldown_config.go:9 - Simple, focused struct with clear field mappings

type DrilldownConfigResponse struct {
    Limits                 map[string]any `json:"limits"`
    PatternIngesterEnabled bool           `json:"pattern_ingester_enabled"`
    Version                string         `json:"version"`
}
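
As an illustration of how a client such as Drilldown might consume this response shape, here is a minimal sketch that decodes a JSON body into the struct quoted above. The payload values and the helper name decodeDrilldownConfig are assumptions made for the example; they are not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DrilldownConfigResponse mirrors the struct quoted in the review above.
type DrilldownConfigResponse struct {
	Limits                 map[string]any `json:"limits"`
	PatternIngesterEnabled bool           `json:"pattern_ingester_enabled"`
	Version                string         `json:"version"`
}

// decodeDrilldownConfig is a hypothetical helper that parses a response body.
func decodeDrilldownConfig(body []byte) (DrilldownConfigResponse, error) {
	var resp DrilldownConfigResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

func main() {
	// Illustrative payload only; real limit names and values depend on the allowlist.
	body := []byte(`{"limits":{"volume_enabled":true},"pattern_ingester_enabled":true,"version":"unknown"}`)

	resp, err := decodeDrilldownConfig(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.PatternIngesterEnabled, resp.Version, resp.Limits["volume_enabled"])
}
```

Keeping Limits as a map means a client does not need to know every limit name in advance, which fits an allowlist that can change over time.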

pkg/loki/config_handler.go:193-243 - Handler follows Loki conventions:

Proper error handling with appropriate HTTP status codes
Tenant-aware limit resolution with fallback chain
JSON response with correct content-type
Uses existing filterLimitFields for consistency

Limit Resolution Logic

The fallback chain is implemented correctly:

Try TenantLimits.TenantLimits(user) for per-tenant overrides
Fall back to Overrides.DefaultLimits() for defaults
Error if no limits are available

This matches the expected behavior described in the tests and follows Loki's multi-tenancy patterns.
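
The fallback chain described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: Limits, TenantLimits, staticTenantLimits, and resolveLimits are stand-ins defined for the example, not Loki's actual types or handler code.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits is a stand-in for Loki's validation.Limits; only one field is modelled.
type Limits struct {
	MaxEntriesLimitPerQuery int
}

// TenantLimits is a hypothetical, simplified per-tenant lookup interface.
type TenantLimits interface {
	TenantLimits(userID string) *Limits
}

// staticTenantLimits is an in-memory implementation used for the example.
type staticTenantLimits map[string]*Limits

func (s staticTenantLimits) TenantLimits(userID string) *Limits { return s[userID] }

var errNoLimits = errors.New("no default limits configured")

// resolveLimits tries per-tenant overrides first, then defaults, then errors.
func resolveLimits(tl TenantLimits, defaults *Limits, userID string) (*Limits, error) {
	if tl != nil {
		if l := tl.TenantLimits(userID); l != nil {
			return l, nil
		}
	}
	if defaults != nil {
		return defaults, nil
	}
	return nil, errNoLimits
}

func main() {
	overrides := staticTenantLimits{"tenant-a": &Limits{MaxEntriesLimitPerQuery: 10}}
	defaults := &Limits{MaxEntriesLimitPerQuery: 5000}

	for _, user := range []string{"tenant-a", "tenant-b"} {
		l, err := resolveLimits(overrides, defaults, user)
		if err != nil {
			panic(err)
		}
		fmt.Println(user, l.MaxEntriesLimitPerQuery)
	}
}
```

Checking the per-tenant source for nil before calling it lets the same function serve deployments that have no per-tenant overrides at all.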

Performance Considerations

✅ Efficient O(1) allowlist lookup using map
✅ Direct JSON marshaling of validation.Limits (has proper json tags)
✅ No unnecessary data copying or transformations
✅ Lightweight endpoint suitable for frequent polling
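
For readers unfamiliar with the allowlist approach mentioned above, a map-backed filter could look something like the sketch below. The function name filterByAllowlist and its signature are assumptions for illustration; the real filterLimitFields in Loki may be implemented differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterByAllowlist is a hypothetical helper that keeps only the limit fields
// whose names appear in the allowlist.
func filterByAllowlist(limits map[string]any, allowlist []string) map[string]any {
	// Build a set once so each membership check is a constant-time map lookup.
	allowed := make(map[string]struct{}, len(allowlist))
	for _, name := range allowlist {
		allowed[name] = struct{}{}
	}

	filtered := make(map[string]any, len(allowlist))
	for name, value := range limits {
		if _, ok := allowed[name]; ok {
			filtered[name] = value
		}
	}
	return filtered
}

func main() {
	// Illustrative values only.
	limits := map[string]any{
		"max_query_length": "1h",
		"volume_enabled":   true,
		"ingestion_rate":   4,
	}
	allowlist := []string{"max_query_length", "volume_enabled"}

	out, err := json.Marshal(filterByAllowlist(limits, allowlist))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Fields not named in the allowlist are dropped by default, so a newly added limit stays hidden until an operator opts in.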

Pattern Integration

✅ Clean integration with t.Cfg.Pattern.Enabled
✅ Provides necessary config for Drilldown UI decisions

🚦 Minor Observations

Version Handling (pkg/loki/config_handler.go:226-229)

version := build.GetVersion().Version
if version == "" {
    version = "unknown"
}

Good defensive programming - handles edge cases where version might be empty.

Error Messages

All error messages are clear and actionable:

"No default limits configured" (line 213)
Clear tenant extraction errors propagated from dskit

🎯 Security Assessment

✅ No Security Concerns Identified

Uses existing tenant isolation mechanisms
Respects configured field allowlists
No credential exposure risk
Same authorization model as existing config endpoints
No potential for privilege escalation

📊 Test Coverage Analysis

The test suite is exemplary with:

Multiple test files covering different aspects
527 lines of comprehensive test code
Edge case coverage including missing tenants, empty configs
Mock implementations properly simulating real behavior
Table-driven tests for maintainability
Response validation checking both structure and content

🎯 Recommendation

✅ APPROVED - This PR is well-implemented and ready for merge.

Why This PR Succeeds:

Addresses real need: Solves Drilldown's path accessibility issue vs /config/tenant/v1/limits
Future-proof: Extensible for additional config beyond just limits
Consistent: Follows established Loki patterns and conventions
Thoroughly tested: Comprehensive test coverage with edge cases
Secure: Proper tenant isolation and field filtering
Performant: Efficient implementation suitable for production use

No changes required - this implementation demonstrates good software engineering practices and is ready for production deployment.

trevorwhitney · 2025-08-29T17:13:03Z

@claude are you there? this is a test

claude · 2025-08-29T17:13:33Z

Claude finished @trevorwhitney's task

Yes, I'm here and working!

Todo List

Respond to test message

Test confirmed - Claude Code is operational and ready to help with your Loki development tasks!

gtk-grafana · 2025-09-03T15:33:16Z

@trevorwhitney I'm not seeing volume_enabled getting pulled through when running locally. Is this something we're able to add to this PR?

gtk-grafana · 2025-09-03T16:37:38Z

A few more questions I asked in Slack that might be nice to have greater visibility on:

All of the limits we’re returning (i.e. max_query_length, max_query_lookback, max_query_range) seem to be duration, are there limits associated with bytes processed that we might want to expose?

pattern_persistence_enabled is in limits, but pattern_ingester_enabled is a top level property, is this on purpose?

otlp_config contains AttributesConfig properties that appear identical, same action, same attributes.
Why are there two entries? What does it mean?

{
  "otlp_config": {
      "LogAttributes": null,
      "ResourceAttributes": {
          "AttributesConfig": [
              {
                  "Action": "index_label",
                  "Attributes": [
                      "service.name",
                      "service.namespace",
                      "service.instance.id",
                      "deployment.environment",
                      "deployment.environment.name",
                      "cloud.region",
                      "cloud.availability_zone",
                      "k8s.cluster.name",
                      "k8s.namespace.name",
                      "k8s.pod.name",
                      "k8s.container.name",
                      "container.name",
                      "k8s.replicaset.name",
                      "k8s.deployment.name",
                      "k8s.statefulset.name",
                      "k8s.daemonset.name",
                      "k8s.cronjob.name",
                      "k8s.job.name"
                  ],
                  "Regex": ""
              },
              {
                  "Action": "index_label",
                  "Attributes": [
                      "service.name",
                      "service.namespace",
                      "service.instance.id",
                      "deployment.environment",
                      "deployment.environment.name",
                      "cloud.region",
                      "cloud.availability_zone",
                      "k8s.cluster.name",
                      "k8s.namespace.name",
                      "k8s.pod.name",
                      "k8s.container.name",
                      "container.name",
                      "k8s.replicaset.name",
                      "k8s.deployment.name",
                      "k8s.statefulset.name",
                      "k8s.daemonset.name",
                      "k8s.cronjob.name",
                      "k8s.job.name"
                  ],
                  "Regex": ""
              }
          ],
          "IgnoreDefaults": false
      },
      "ScopeAttributes": null,
      "SeverityTextAsLabel": false
  }
}

gtk-grafana · 2025-09-04T15:17:09Z

All of the limits we’re returning (i.e. max_query_length, max_query_lookback, max_query_range) seem to be duration, are there limits associated with bytes processed that we might want to expose?

A: there are, but you'd need to make a case for them. The argument against exposing them is that an operator may not want users to discover certain limits on their queries, since users may then ask the operator to bump them.

If discover_log_levels is false, can we assume that level is a label?

A: no, we can never assume anything about a customer's labeling strategy. discover_log_levels was implemented to give us a consistent place where we can assume the level will be (detected_level).

pattern_persistence_enabled is in limits, but pattern_ingester_enabled is a top level property, is this on purpose?

A: yes, the latter is a Loki config, and the whole reason we needed an endpoint beyond the tenant limits endpoint

otlp_config contains AttributesConfig properties that appear identical, same action, same attributes.
Why are there two entries? What does it mean?

A: that's a good question, I'm not sure, I'd have to poke around a bit to see if this happens everywhere or is an artifact of your config.

gtk-grafana · 2025-09-10T12:05:14Z

@trevorwhitney can we add max_entries_limit_per_query?

gtk-grafana · 2025-09-11T15:44:12Z

And max_query_bytes_read?

trevorwhitney · 2025-09-15T23:02:51Z

@gtk-grafana I added the extra properties to the default allow list. I also confirmed the duplicate otlp config you noticed. It looks like we're registering the defaults twice, here and here. I'm investigating why.
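
As a generic illustration of how registering defaults twice can produce duplicated entries like the ones in the JSON above, consider the sketch below. Every type and function here is hypothetical and heavily simplified; it does not reproduce Loki's actual OTLP config code, and the real cause was still under investigation at this point in the thread.

```go
package main

import "fmt"

// AttributesConfig and ResourceAttributes are simplified stand-ins that only
// loosely mirror the shape of the JSON shown earlier in the thread.
type AttributesConfig struct {
	Action     string
	Attributes []string
}

type ResourceAttributes struct {
	AttributesConfig []AttributesConfig
}

// applyDefaults appends the default entry unconditionally, so calling it
// more than once on the same value is not idempotent.
func (r *ResourceAttributes) applyDefaults() {
	r.AttributesConfig = append(r.AttributesConfig, AttributesConfig{
		Action:     "index_label",
		Attributes: []string{"service.name", "service.namespace"},
	})
}

func main() {
	var r ResourceAttributes
	r.applyDefaults() // first registration
	r.applyDefaults() // second registration duplicates the entry
	fmt.Println(len(r.AttributesConfig)) // prints 2
}
```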

github-actions · 2025-09-17T22:43:17Z

💻 Deploy preview deleted.

trevorwhitney · 2025-09-24T17:10:24Z

chatted with @salvacorts about this one today. we're going to move the endpoint to something more drilldown specific to make it more clear what it's for (ie api/v1/drilldown-limits), and going to try and consolidate the logic a bit better with the existing limits endpoint.

trevorwhitney · 2025-10-08T21:59:42Z

pkg/loki/config_handler.go

-		if t.TenantLimits == nil {
-			http.Error(w, "Tenant configs not enabled", http.StatusNotFound)
-			return
-		}
-


this logic was moved, because previously defaults weren't working unless there was a runtime config file configured.

salvacorts · 2025-10-09T07:22:46Z

pkg/loki/config_handler.go

+				return
+			}
+		} else {
+			writeYAMLResponse(w, filteredLimits)


super nit so feel free to disregard: it would be more idiomatic to return early and avoid this if-else

if !forDrilldown {
    writeYAMLResponse(w, filteredLimits)
    return
}

// Hereafter, we know the response should be for drilldown so we build the custom response
version := build.GetVersion().Version
if version == "" {
    version = "unknown"
}
...

yep, I agree, I'll make that change!

trevorwhitney requested a review from a team as a code owner August 26, 2025 22:55

pull-request-size bot added the size/XL label Aug 26, 2025

gtk-grafana mentioned this pull request Aug 27, 2025

chore: add tenant id to generator, and enable auth in loki grafana/logs-drilldown#1519

Merged

trevorwhitney added the claude-review Requests a review from Claude Code when added label Aug 29, 2025

This was referenced Sep 2, 2025

Loki: resource frontend browser cache grafana/grafana#110488

Merged

docs: local multitenant config #19049

Closed

gtk-grafana mentioned this pull request Sep 3, 2025

feat(config): Support Loki config API endpoint grafana/logs-drilldown#1526

Merged

trevorwhitney added 6 commits September 17, 2025 16:30

feat: new config endpoint for Logs Drilldown

1c06b3d

chore: revert unndeeded changes

78ea12a

chore: removed unused code

9e9307f

chore: consolidated test cases

29fd9f4

feat: add a few more configs to the default allow list

a746358

fix: lint and format

c0d7feb

trevorwhitney force-pushed the dirlldown-config-endpoint branch from 23ee772 to c0d7feb Compare September 17, 2025 22:34

docs: regen docs

63303b6

trevorwhitney added 3 commits October 8, 2025 15:00

chore: move drilldown limits route, reuse tenant limits handler

d83a4d8

Merge branch 'main' into dirlldown-config-endpoint

7c13324

chore: simplify a bit more

d525510

trevorwhitney commented Oct 8, 2025


salvacorts approved these changes Oct 9, 2025


trevorwhitney added 2 commits October 9, 2025 10:07

refactor: use more idiomic early return

d093525

Merge branch 'main' into dirlldown-config-endpoint

2267458

trevorwhitney merged commit 52b5d95 into main Oct 9, 2025
65 checks passed

trevorwhitney deleted the dirlldown-config-endpoint branch October 9, 2025 23:11

trevorwhitney mentioned this pull request Oct 20, 2025

Per-tenant configuration endpoint #16903

Closed

3 participants
