Add deterministic per-file timing summary to sqllogictest runner #20569
kosiew merged 12 commits into apache:main
Capture elapsed time once in spawned per-file task and reuse after join. Remove redundant post-join measurement while maintaining existing error behavior. Implement safe fallback to Duration::ZERO for join-level panics or errors where elapsed time is not available.
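The "measure once inside the task, reuse after join" idea from the commit above can be sketched roughly as follows. This is a minimal illustration, not the runner's code: `FileOutcome` and `run_timed` are hypothetical names, and `std::thread` stands in for whatever task spawner the runner actually uses.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical result of running one test file.
struct FileOutcome {
    elapsed: Duration,
    result: Result<(), String>,
}

/// Run `work` on its own thread and measure elapsed time inside the task.
fn run_timed<F>(work: F) -> FileOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let handle = thread::spawn(move || {
        let start = Instant::now();
        let result = work();
        // Measured exactly once, inside the task, and returned with the result.
        (start.elapsed(), result)
    });

    match handle.join() {
        // Reuse the elapsed value captured by the task; no second measurement.
        Ok((elapsed, result)) => FileOutcome { elapsed, result },
        // The task panicked, so no measurement came back: fall back to zero.
        Err(_) => FileOutcome {
            elapsed: Duration::ZERO,
            result: Err("task panicked".to_string()),
        },
    }
}

fn main() {
    let ok = run_timed(|| {
        thread::sleep(Duration::from_millis(5));
        Ok(())
    });
    println!("ok: {:?} in {:?}", ok.result, ok.elapsed);

    let panicked = run_timed(|| panic!("boom"));
    println!("panicked: {:?} in {:?}", panicked.result, panicked.elapsed);
}
```

Returning the duration together with the result keeps the timing tied to the work itself rather than to when the joining side happens to observe completion.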
Ensure --timing-top-n accepts only values >= 1 by using clap's value parser with a defined range. Update help text to reflect this new requirement and clarify in README.md to avoid silent runtime coercion.
Clarify default behavior for timing summaries in TTY and non-TTY/CI runs. Maintain conciseness within the existing timing-summary section.
Replace Clap parser call in sqllogictests.rs:949 with a custom parser function. Add validation to ensure usize values are >= 1 in lines 433-443, providing a clear error message for any input of 0.
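A custom parser of the kind described in the commit above might look like the sketch below. The function name `parse_top_n` and the error wording are assumptions for illustration; only the rule itself (a `usize` that must be `>= 1`) comes from the commit message.

```rust
/// Hypothetical parser for a `--timing-top-n` style value:
/// a `usize` that must be at least 1.
fn parse_top_n(raw: &str) -> Result<usize, String> {
    let value: usize = raw
        .trim()
        .parse()
        .map_err(|e| format!("invalid value '{raw}': {e}"))?;
    if value == 0 {
        // Reject 0 explicitly instead of silently coercing it at runtime.
        return Err("must be >= 1 (got 0)".to_string());
    }
    Ok(value)
}

fn main() {
    for raw in ["10", "0", "abc"] {
        match parse_top_n(raw) {
            Ok(n) => println!("{raw} -> ok: {n}"),
            Err(msg) => println!("{raw} -> error: {msg}"),
        }
    }
}
```

A function with this shape (`&str` in, `Result` out) is the usual form for a custom clap value parser, so the validation error surfaces at argument-parsing time.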
let top_n = options.timing_top_n;
let count = match mode {
    TimingSummaryMode::Off => 0,
nit: This is already handled at line 391.
Good catch. I will remove the redundant TimingSummaryMode::Off handling from the count calculation since Off already returns early.
let top_n = options.timing_top_n;
let count = match mode {
    TimingSummaryMode::Off => 0,
    TimingSummaryMode::Auto | TimingSummaryMode::Top => top_n,
nit: mode cannot be TimingSummaryMode::Auto because Options::timing_summary_mode() does not return it. But it is not doing any harm either.
Agreed. timing_summary_mode() normalizes Auto to Top/Off before this point.
I will update the branch logic to only rely on Top vs Full and add a debug_assert! to document/enforce that invariant in debug builds.
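The invariant discussed in this thread could be expressed along these lines. This is a sketch under assumptions: `normalize` and `summary_count` are hypothetical helpers (the real code uses `Options::timing_summary_mode()`), and the TTY rule is the one stated in the PR description (`auto` becomes `off` on a TTY and `top` otherwise).

```rust
use std::io::IsTerminal;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimingSummaryMode {
    Auto,
    Off,
    Top,
    Full,
}

/// Resolve `Auto` to a concrete mode: quiet on a TTY, top-N otherwise.
fn normalize(mode: TimingSummaryMode, is_tty: bool) -> TimingSummaryMode {
    match mode {
        TimingSummaryMode::Auto if is_tty => TimingSummaryMode::Off,
        TimingSummaryMode::Auto => TimingSummaryMode::Top,
        other => other,
    }
}

/// Number of rows to print for an already-normalized mode.
fn summary_count(mode: TimingSummaryMode, top_n: usize, total: usize) -> usize {
    if mode == TimingSummaryMode::Off {
        return 0; // `Off` returns early, so the match below never sees it.
    }
    // Documents (and checks in debug builds) that `Auto` was resolved earlier.
    debug_assert!(matches!(
        mode,
        TimingSummaryMode::Top | TimingSummaryMode::Full
    ));
    match mode {
        TimingSummaryMode::Full => total,
        _ => top_n.min(total),
    }
}

fn main() {
    let is_tty = std::io::stdout().is_terminal();
    let mode = normalize(TimingSummaryMode::Auto, is_tty);
    println!("mode = {mode:?}, rows = {}", summary_count(mode, 10, 25));
}
```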
Thanks @martin-g for the quick review.
alamb left a comment
This feature will be super helpful for scheduling the tests more carefully if we go with #20576.
One quick thought I had while skimming this PR: I wonder if we really need all the different modes.
It seems the key thing we can't do without changes to sqllogictests itself is get the per-file timing. Everything else we could do with post-run scripts.
For example, rather than adding a special flag like --timing-top-n 10, maybe we could follow the Unix philosophy and pipe the output to head -n 10:
cargo test --test sqllogictests -- --timing-summary | head -n 10
Just a thought to keep the code a bit simpler.
  ColorChoice::Auto => {
      // CARGO_TERM_COLOR takes precedence over auto-detection
-     let cargo_term_color = ColorChoice::from_str(
+     let cargo_term_color = <ColorChoice as FromStr>::from_str(
It is needed because clap::ValueEnum is in scope too.
https://docs.rs/clap/latest/clap/trait.ValueEnum.html#method.from_str
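The need for the fully qualified call can be reproduced without clap. In the sketch below, `ValueEnumLike` is a hypothetical stand-in for a second trait that also declares an associated function named `from_str`, and `Color` is an illustrative enum. With both traits in scope, a plain `Color::from_str(...)` is expected to be rejected as ambiguous, so each call names its trait explicitly.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Color {
    Always,
    Never,
}

impl FromStr for Color {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "always" => Ok(Color::Always),
            "never" => Ok(Color::Never),
            other => Err(format!("unknown color choice: {other}")),
        }
    }
}

/// Hypothetical second trait with its own `from_str`, taking an extra flag.
trait ValueEnumLike: Sized {
    fn from_str(input: &str, ignore_case: bool) -> Result<Self, String>;
}

impl ValueEnumLike for Color {
    fn from_str(input: &str, ignore_case: bool) -> Result<Self, String> {
        let owned = if ignore_case {
            input.to_lowercase()
        } else {
            input.to_string()
        };
        // Delegate to the `FromStr` impl, naming the trait explicitly.
        <Color as FromStr>::from_str(&owned)
    }
}

fn main() {
    // Each call spells out which trait's `from_str` is meant.
    let a = <Color as FromStr>::from_str("always");
    let b = <Color as ValueEnumLike>::from_str("NEVER", true);
    println!("{a:?} {b:?}");
}
```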
Streamline timing summary to a single switch, enabling full deterministic per-file timings sorted slowest-first. Eliminate all mode and top-N options in sqllogictests.rs, including the removal of TimingSummaryMode and related auto branching for summary output. Update README.md to recommend Unix post-processing with `| head -n 10`.
Agreed and simplified.
hmmm.... It's not as straightforward as I thought.
This reverts commit 2eb94d4.
looks good -- thanks! One possibility is to avoid printing progress when in "timing mode" and only print out the overall runtime.
Which issue does this PR close?
Rationale for this change
The sqllogictest runner executes files in parallel, but it was hard to pinpoint which test files dominate wall-clock time. This change adds deterministic per-file elapsed timing observability so we can identify long-tail files and prioritize follow-up optimization work, while keeping default output usable for both local development (TTY) and CI (non-TTY).
What changes are included in this PR?
- Collect per-file elapsed durations in the sqllogictest runner and aggregate them at end-of-run.
- Print a deterministic timing summary (stable sort: elapsed desc, path asc; stable formatting) via `MultiProgress` to avoid interleaved progress-bar noise.
- Add CLI flags and environment variables to control output:
  - `--timing-summary auto|off|top|full` (also `SLT_TIMING_SUMMARY`)
  - `--timing-top-n <N>` (also `SLT_TIMING_TOP_N`, must be `>= 1`)
- Default behavior: `auto` maps to `off` for local TTY runs and `top` for CI/non-TTY runs.
- Add optional debug logging for slow files (over 30s) behind `SLT_TIMING_DEBUG_SLOW_FILES=1`.
- Update `datafusion/sqllogictest/README.md` with usage examples.
Are these changes tested?
Covered by existing `sqllogictests` integration test execution; no new unit tests were added.
Manual validation plan (ran locally / in CI as applicable):
- `cargo test --test sqllogictests -- push_down_filter_ --test-threads 16`
- `cargo test --test sqllogictests -- --test-threads 16`
- `cargo test --test sqllogictests -- --timing-summary top --timing-top-n 10`
- `cargo test --test sqllogictests -- --timing-summary full`
Verified output properties:
- `auto` mode is quiet on TTY but prints a top-N summary on non-TTY/CI.
Are there any user-facing changes?
Yes (test-runner UX only):
- New optional timing summary output for `sqllogictests`.
- New CLI flags / env vars documented in `datafusion/sqllogictest/README.md`:
  - `--timing-summary auto|off|top|full` / `SLT_TIMING_SUMMARY`
  - `--timing-top-n <N>` / `SLT_TIMING_TOP_N`
  - `SLT_TIMING_DEBUG_SLOW_FILES=1` (optional debug logging for slow files >30s)
- No public DataFusion APIs are changed.
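The deterministic ordering named in the change list (elapsed descending, then path ascending) can be sketched as below. `FileTiming`, `sort_timings`, and `format_row` are hypothetical names, and the row format is only an assumed example of "stable formatting", not the runner's actual output.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Hypothetical per-file timing record.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileTiming {
    path: PathBuf,
    elapsed: Duration,
}

/// Sort slowest-first; break ties by path so the order never depends on
/// which parallel task happened to finish first.
fn sort_timings(timings: &mut [FileTiming]) {
    timings.sort_by(|a, b| {
        b.elapsed
            .cmp(&a.elapsed)
            .then_with(|| a.path.cmp(&b.path))
    });
}

/// Fixed-width row (assumed format): seconds with millisecond precision.
fn format_row(t: &FileTiming) -> String {
    format!("{:>8.3}s  {}", t.elapsed.as_secs_f64(), t.path.display())
}

fn main() {
    let mut timings = vec![
        FileTiming { path: PathBuf::from("b.slt"), elapsed: Duration::from_millis(1500) },
        FileTiming { path: PathBuf::from("c.slt"), elapsed: Duration::from_millis(4200) },
        FileTiming { path: PathBuf::from("a.slt"), elapsed: Duration::from_millis(1500) },
    ];
    sort_timings(&mut timings);
    for t in &timings {
        println!("{}", format_row(t));
    }
}
```

Because the comparator fully orders records by `(elapsed, path)`, two runs with the same measurements print identical summaries regardless of task completion order.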
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.