proposal: runtime/pprof: structured, programmatic access to go race reports #75434

@chabbimilind

Proposal Details

The current methods for handling race detector output are fragile and cumbersome, leading to significant developer friction, repeated work, and missed opportunities for automated bug detection.

  1. Fragile stderr output of TSAN: Race reports from Go programs compiled with -race are long, multiline strings that are easily corrupted or scattered when interleaved with other logs. Parsing this unstructured text is brittle and error-prone. At Uber, our stderr logs are consumed by a logging system that splits logs on the line separator, so race stacks get scattered, making it nearly impossible to reconstruct them from production machines that run race-enabled binaries. (The use case here is running a program built with the -race flag and with halt_on_error=0 to find production races.)

  2. Cumbersome file-based output: Setting GORACE=log_path=file redirects the race output to a file to avoid log corruption. However, it requires a separate, complex process to periodically poll the file, handle partial writes, and reliably parse each report. This adds overhead and complexity to deployment and monitoring pipelines.

  3. Brittle log parsing: The reliance on external log parsers is inherently fragile. Small changes in the report format can break log-scraping tools, requiring constant maintenance and reducing the reliability of automated race detection in CI/CD and production.

Proposed Solution: A Standard, Structured API

A new API is required to provide a robust and durable solution, enabling Go programs to directly consume race reports as structured data. runtime/pprof is the ideal location for this API, as it is the established home for profiling and diagnostic tools.

The API should expose a structured report object that includes, at a minimum:

  1. A pair of structured stack traces: One for each of the conflicting accesses along with the goroutine ancestry.
  2. The memory address of the conflicting access (more on this below).
  3. The goroutine IDs involved.

Enhancing Race Debugging with Allocation Site Information

While structured access to race reports is the primary objective of this proposal, we must also address a major pain point in race debugging: the difficulty of identifying the object involved. Currently, the race detector can't easily link a race to the stack trace that allocated the racy object.

Therefore, this proposal also suggests the long-term goal of exposing the allocation site stack within the structured report. This would enable a full, end-to-end view of the race, from the object's creation to its conflicting accesses, which would dramatically reduce the time spent on root cause analysis.

Metadata

Labels: LibraryProposal, Proposal
