fix(fold): GNU fold-characters.sh test #9126

mattsu2020 · 2025-11-03T08:29:01Z

Fix the GNU fold-characters.sh test.

Added a new FoldContext helper to keep width mode, whitespace handling, buffers, and counters together, letting process_utf8_line and process_non_utf8_line focus solely on folding logic.
Refactored folding helpers to use the shared context and route through the unified emit_output, resolving the Clippy too_many_arguments errors without altering behaviour.

#9127

Add a new --characters flag to the fold utility, allowing it to count using Unicode character positions rather than display columns. This provides more accurate line breaking for text containing wide characters. Includes dependency on unicode-width crate and updated help messages in English and French locales.
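To illustrate the difference between counting display columns and counting characters, here is a minimal sketch. It is not the code from this PR: `fold_line` and `approx_width` are hypothetical names, and `approx_width` is a deliberately crude stand-in for a real width table such as the one the unicode-width crate provides.

```rust
/// How positions are counted when deciding where to break a line.
#[derive(Clone, Copy, PartialEq)]
enum WidthMode {
    /// Display columns: wide characters occupy two positions.
    Columns,
    /// Unicode scalar values: every character occupies one position.
    Characters,
}

/// Hypothetical, simplified width lookup (illustration only): treats a few
/// East Asian blocks as double-width and everything else as single-width.
fn approx_width(ch: char) -> usize {
    match ch as u32 {
        0x1100..=0x115F | 0x2E80..=0xA4CF | 0xAC00..=0xD7A3 | 0xF900..=0xFAFF | 0xFF00..=0xFF60 => 2,
        _ => 1,
    }
}

/// Split `line` into pieces of at most `width` positions in the given mode.
fn fold_line(line: &str, width: usize, mode: WidthMode) -> Vec<String> {
    let mut pieces = Vec::new();
    let mut current = String::new();
    let mut count = 0;
    for ch in line.chars() {
        let w = match mode {
            WidthMode::Columns => approx_width(ch),
            WidthMode::Characters => 1,
        };
        // Break before a character that would overflow the current piece.
        if count + w > width && !current.is_empty() {
            pieces.push(std::mem::take(&mut current));
            count = 0;
        }
        current.push(ch);
        count += w;
    }
    if !current.is_empty() {
        pieces.push(current);
    }
    pieces
}
```

With a width of 4, a run of seven double-width characters breaks after every second character in `Columns` mode but after every fourth character in `Characters` mode.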

…fy function signatures

Introduce a new FoldContext struct to group related fields (spaces, width, mode, writer, output, col_count, last_space) into a single context object. This refactoring reduces parameter passing in emit_output and process_utf8_line functions, improving code readability and maintainability without altering the core folding logic.
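A rough sketch of what such a context object could look like follows. The field names come from the commit message above; the field types, the `process_line` and `fold_to_vec` helpers, and the folding rule itself are assumptions made for illustration and are much simpler than the real implementation.

```rust
use std::io::{self, Write};

#[allow(dead_code)]
#[derive(Clone, Copy)]
enum WidthMode {
    Columns,
    Characters,
}

/// Groups the folding state so helpers take one argument instead of seven.
/// Field types are assumed for this sketch.
#[allow(dead_code)]
struct FoldContext<'a, W: Write> {
    spaces: bool,              // prefer breaking at whitespace
    width: usize,              // maximum positions per output line
    mode: WidthMode,           // how positions are counted
    writer: &'a mut W,         // final destination
    output: Vec<u8>,           // bytes of the line being built
    col_count: usize,          // positions used so far
    last_space: Option<usize>, // index in `output` of the last space seen
}

/// Flush the pending line and reset the per-line counters.
fn emit_output<W: Write>(ctx: &mut FoldContext<'_, W>) -> io::Result<()> {
    ctx.writer.write_all(&ctx.output)?;
    ctx.writer.write_all(b"\n")?;
    ctx.output.clear();
    ctx.col_count = 0;
    ctx.last_space = None;
    Ok(())
}

/// Simplified byte-wise folding that only needs the shared context.
fn process_line<W: Write>(line: &[u8], ctx: &mut FoldContext<'_, W>) -> io::Result<()> {
    for &byte in line {
        if ctx.col_count >= ctx.width {
            emit_output(ctx)?;
        }
        if byte == b' ' {
            ctx.last_space = Some(ctx.output.len());
        }
        ctx.output.push(byte);
        ctx.col_count += 1;
    }
    Ok(())
}

/// Hypothetical convenience wrapper: fold one line into an in-memory buffer.
fn fold_to_vec(line: &[u8], width: usize) -> io::Result<Vec<u8>> {
    let mut sink = Vec::new();
    let mut ctx = FoldContext {
        spaces: false,
        width,
        mode: WidthMode::Columns,
        writer: &mut sink,
        output: Vec::new(),
        col_count: 0,
        last_space: None,
    };
    process_line(line, &mut ctx)?;
    if !ctx.output.is_empty() {
        emit_output(&mut ctx)?;
    }
    Ok(sink)
}
```

Passing a single `&mut FoldContext` keeps each helper's signature short, which is the kind of change that satisfies Clippy's `too_many_arguments` lint.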

codspeed-hq · 2025-11-03T08:41:27Z

CodSpeed Performance Report

Merging #9126 will improve performance by 49.37%

Comparing mattsu2020:fold_fix (ff1a312) with main (5318c9c)

Summary

⚡ 2 improvements
✅ 121 untouched
⏩ 5 skipped¹

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Change |
|---|---|---|---|---|
| ⚡ | `fold_custom_width[50000]` | 43.4 ms | 32.1 ms | +34.93% |
| ⚡ | `fold_many_lines[100000]` | 116.2 ms | 77.8 ms | +49.37% |

¹ 5 benchmarks were skipped, so the baseline results were used instead.

github-actions · 2025-11-03T08:50:50Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

sylvestre · 2025-11-03T21:17:58Z

it significantly regressed the benchmark performances:
https://codspeed.io/uutils/coreutils/branches/mattsu2020%3Afold_fix?uri=src%2Fuu%2Ffold%2Fbenches%2Ffold_bench.rs%3A%3Afold_many_lines%5B100000%5D&runnerMode=Instrumentation&sectionId=benchmark-comparison-section-comparison-failed

could you please have a look? thanks

Add process_ascii_line function to handle ASCII bytes efficiently, avoiding UTF-8 overhead for ASCII input. Update emit_output to properly manage output buffer remainder and track last space position for better folding logic. Modify process_utf8_line to delegate ASCII lines to the new function.
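The idea of delegating ASCII lines to a cheaper path can be sketched as below. The function names match the commit message, but the signatures and bodies are assumptions: this version ignores tabs, backspaces, whitespace-aware breaking, and display width, and counts one position per character.

```rust
/// Fast path: in pure ASCII every byte is exactly one position,
/// so the line can be cut into fixed-size byte chunks.
fn process_ascii_line(line: &[u8], width: usize, out: &mut Vec<u8>) {
    // `chunks` panics on 0, so clamp the width to at least 1.
    for (i, chunk) in line.chunks(width.max(1)).enumerate() {
        if i > 0 {
            out.push(b'\n');
        }
        out.extend_from_slice(chunk);
    }
}

/// General path: decode characters one by one, but hand pure-ASCII
/// input to the byte-oriented fast path first.
fn process_utf8_line(line: &str, width: usize, out: &mut Vec<u8>) {
    if line.is_ascii() {
        process_ascii_line(line.as_bytes(), width, out);
        return;
    }
    let width = width.max(1);
    let mut count = 0;
    let mut buf = [0u8; 4];
    for ch in line.chars() {
        if count == width {
            out.push(b'\n');
            count = 0;
        }
        out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        count += 1;
    }
}
```

The `is_ascii` check is a single linear scan over the bytes, which is typically cheaper than decoding every character when the input turns out to be plain ASCII.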

Add "rposition" to the cspell jargon dictionary to prevent spell check errors for this technical term.

github-actions · 2025-11-04T00:13:58Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

- Replace fold/writeln! with loop/push_str in benchmarks for faster string building
- Add append_usize helper to avoid allocations in benchmark data generation
- Refactor emit_output to use drain instead of split_off for better performance
- Update last_space calculation to handle index adjustments more efficiently

These changes improve performance in the fold utility's benchmarks and core logic by reducing allocations and optimizing string operations.
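Two of these changes can be sketched in isolation. The code below is an illustration under assumptions, not the PR's code: `emit_prefix` is a hypothetical helper showing how `drain` removes the emitted prefix in place (whereas `split_off` allocates a second vector for the remainder) and how a stored space index can be shifted afterwards, and `append_usize` shows one way to append a number to a `String` without creating a temporary `String`.

```rust
use std::io::{self, Write};

/// Write the first `consume` bytes of `output` as one line, then keep the
/// remainder in the same buffer. `consume` must not exceed `output.len()`.
fn emit_prefix<W: Write>(
    writer: &mut W,
    output: &mut Vec<u8>,
    last_space: &mut Option<usize>,
    consume: usize,
) -> io::Result<()> {
    writer.write_all(&output[..consume])?;
    writer.write_all(b"\n")?;
    // Shifts the remaining bytes to the front without a new allocation.
    output.drain(..consume);
    // A space inside the emitted prefix is gone; later ones move left.
    *last_space = match *last_space {
        Some(idx) if idx < consume => None,
        Some(idx) => Some(idx - consume),
        None => None,
    };
    Ok(())
}

/// Append the decimal form of `value` to `buf` using a stack buffer.
fn append_usize(buf: &mut String, mut value: usize) {
    let mut digits = [0u8; 20]; // enough for a 64-bit usize
    let mut pos = digits.len();
    loop {
        pos -= 1;
        digits[pos] = b'0' + (value % 10) as u8;
        value /= 10;
        if value == 0 {
            break;
        }
    }
    for &d in &digits[pos..] {
        buf.push(d as char);
    }
}
```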

…output

Break the inline closure into a multi-line block for better code clarity and maintainability.

The condition for updating the last space index was changed from `idx + 1 <= consume` to `idx < consume` to fix an off-by-one error, ensuring proper handling of spaces when consuming characters during line folding.

…andling

Refactor the `process_ascii_line` function to use a while loop with pattern matching instead of a for loop, improving efficiency and clarity. Introduce `push_ascii_segment` to handle contiguous printable character sequences, ensuring accurate column counting and whitespace tracking in both columns and characters modes. This addresses potential issues with control character processing and width calculations.
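The loop shape described here might look roughly like the following. This is a sketch under assumptions: the signatures are invented for illustration, only carriage return and backspace are treated as control characters, and tab stops and whitespace-aware breaking are left out.

```rust
/// Append a run of printable bytes, inserting a newline whenever the
/// current line is full. Each byte counts as one position.
fn push_ascii_segment(mut segment: &[u8], width: usize, col: &mut usize, out: &mut Vec<u8>) {
    let width = width.max(1); // avoid a zero-width infinite loop
    while !segment.is_empty() {
        if *col >= width {
            out.push(b'\n');
            *col = 0;
        }
        // Copy as many bytes as still fit on the current line in one go.
        let take = (width - *col).min(segment.len());
        out.extend_from_slice(&segment[..take]);
        *col += take;
        segment = &segment[take..];
    }
}

/// Walk the line with an explicit index so that whole runs of printable
/// bytes can be consumed at once instead of one byte per iteration.
fn process_ascii_line(line: &[u8], width: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(line.len());
    let mut col = 0usize;
    let mut idx = 0usize;
    while idx < line.len() {
        match line[idx] {
            b'\r' => {
                // Carriage return moves back to the first position.
                out.push(b'\r');
                col = 0;
                idx += 1;
            }
            0x08 => {
                // Backspace moves one position left, never below zero.
                out.push(0x08);
                col = col.saturating_sub(1);
                idx += 1;
            }
            _ => {
                // Length of the run up to the next control byte (at least 1).
                let len = line[idx..]
                    .iter()
                    .position(|&b| b == b'\r' || b == 0x08)
                    .unwrap_or(line.len() - idx);
                push_ascii_segment(&line[idx..idx + len], width, &mut col, &mut out);
                idx += len;
            }
        }
    }
    out
}
```

Because an ASCII byte is both one character and one display column, the same counting works for the columns and characters modes in this simplified version.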

mattsu2020 · 2025-11-04T01:05:20Z

> it significantly regressed the benchmark performances: https://codspeed.io/uutils/coreutils/branches/mattsu2020%3Afold_fix?uri=src%2Fuu%2Ffold%2Fbenches%2Ffold_bench.rs%3A%3Afold_many_lines%5B100000%5D&runnerMode=Instrumentation&sectionId=benchmark-comparison-section-comparison-failed
>
> could you please have a look? thanks

I tried to improve it, but this is the best I can do.

github-actions · 2025-11-04T01:18:08Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

sylvestre · 2025-11-08T22:51:29Z

could you please move the bench changes into a different PR? thanks

mattsu2020 · 2025-11-10T00:57:16Z

> could you please move the bench changes into a different PR? thanks

#9210

github-actions · 2025-11-11T00:23:37Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!