Conversation

@mattsu2020
Contributor

@mattsu2020 mattsu2020 commented Nov 3, 2025

fix GNU fold-characters.sh test

Added a new FoldContext helper to keep width mode, whitespace handling, buffers, and counters together, letting process_utf8_line and process_non_utf8_line focus solely on folding logic.
Refactored folding helpers to use the shared context and route through the unified emit_output, resolving the Clippy too_many_arguments errors without altering behaviour.

#9127

Add a new --characters flag to the fold utility, allowing it to count using Unicode character positions rather than display columns. This provides more accurate line breaking for text containing wide characters. Includes dependency on unicode-width crate and updated help messages in English and French locales.
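To illustrate the difference between counting display columns and counting characters, here is a minimal sketch. The names `WidthMode`, `approx_width`, and `measure` are hypothetical, and `approx_width` is a crude stand-in for the width table that the `unicode-width` crate provides; it is not the code from this PR.

```rust
/// How positions are counted when deciding where to fold.
/// ASSUMPTION: hypothetical names, chosen for this illustration.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum WidthMode {
    /// Count display columns (wide characters occupy two).
    Columns,
    /// Count Unicode scalar values, one per character.
    Characters,
}

/// Rough stand-in for a real width table: treats a few East Asian
/// blocks as two columns wide and everything else as one.
/// ASSUMPTION: illustrative only; a real tool would use a full table.
fn approx_width(ch: char) -> usize {
    match ch as u32 {
        0x1100..=0x115F | 0x2E80..=0xA4CF | 0xAC00..=0xD7A3 | 0xFF00..=0xFF60 => 2,
        _ => 1,
    }
}

/// Length of `text` under the chosen counting mode.
fn measure(text: &str, mode: WidthMode) -> usize {
    match mode {
        WidthMode::Columns => text.chars().map(approx_width).sum(),
        WidthMode::Characters => text.chars().count(),
    }
}
```

Under these assumptions, a string of three CJK ideographs measures six in `Columns` mode and three in `Characters` mode, so the same width limit folds it at different points.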
…fy function signatures

Introduce a new FoldContext struct to group related fields (spaces, width, mode, writer, output, col_count, last_space) into a single context object. This refactoring reduces parameter passing in emit_output and process_utf8_line functions, improving code readability and maintainability without altering the core folding logic.
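A context struct of this kind might look roughly like the sketch below. Only the field names come from the commit message; the field types, the owned (rather than borrowed) buffers, and the helpers `new`, `flush_line`, and `push_printable` are assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Counting mode. ASSUMPTION: not consulted in this ASCII-only sketch.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
enum WidthMode {
    Columns,
    Characters,
}

/// Groups the state that would otherwise be passed as separate arguments.
#[allow(dead_code)]
struct FoldContext<W: Write> {
    spaces: bool,
    width: usize,
    mode: WidthMode,
    writer: W,
    output: Vec<u8>,
    col_count: usize,
    last_space: Option<usize>,
}

impl<W: Write> FoldContext<W> {
    fn new(writer: W, width: usize, mode: WidthMode, spaces: bool) -> Self {
        Self { spaces, width, mode, writer, output: Vec::new(), col_count: 0, last_space: None }
    }

    /// Write the pending line plus a newline, then reset the counters.
    fn flush_line(&mut self) -> io::Result<()> {
        self.writer.write_all(&self.output)?;
        self.writer.write_all(b"\n")?;
        self.output.clear();
        self.col_count = 0;
        self.last_space = None;
        Ok(())
    }
}

/// A helper needs one `&mut FoldContext` instead of seven parameters.
fn push_printable<W: Write>(ctx: &mut FoldContext<W>, byte: u8) -> io::Result<()> {
    if ctx.col_count >= ctx.width {
        ctx.flush_line()?;
    }
    if ctx.spaces && byte == b' ' {
        // Remember where the most recent space sits in the buffer.
        ctx.last_space = Some(ctx.output.len());
    }
    ctx.output.push(byte);
    ctx.col_count += 1;
    Ok(())
}
```

Keeping the helper's signature to a single context argument is what avoids the Clippy `too_many_arguments` lint mentioned in the PR description.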
@codspeed-hq

codspeed-hq bot commented Nov 3, 2025

CodSpeed Performance Report

Merging #9126 will improve performance by 49.37%

Comparing mattsu2020:fold_fix (ff1a312) with main (5318c9c)

Summary

⚡ 2 improvements
✅ 121 untouched
⏩ 5 skipped [1]

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| fold_custom_width[50000] | 43.4 ms | 32.1 ms | +34.93% |
| fold_many_lines[100000] | 116.2 ms | 77.8 ms | +49.37% |

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@github-actions

github-actions bot commented Nov 3, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

@sylvestre
Contributor

Add process_ascii_line function to handle ASCII bytes efficiently, avoiding UTF-8 overhead for ASCII input. Update emit_output to properly manage output buffer remainder and track last space position for better folding logic. Modify process_utf8_line to delegate ASCII lines to the new function.
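The delegation idea can be sketched as follows. This is a simplified illustration, not the PR's code: `fold_ascii`, `fold_utf8`, and `fold_line` are hypothetical names, folding is done purely by character count, and tabs, backspaces, and space-aware breaking are ignored.

```rust
/// Fast path: every byte is one character, so the line can be cut by index.
/// ASSUMPTION: `width` is greater than zero.
fn fold_ascii(line: &[u8], width: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(line.len() + line.len() / width + 1);
    for (i, chunk) in line.chunks(width).enumerate() {
        if i > 0 {
            out.push(b'\n');
        }
        out.extend_from_slice(chunk);
    }
    out
}

/// General path: decode characters and count them one at a time.
fn fold_utf8(line: &str, width: usize) -> Vec<u8> {
    let mut out = String::with_capacity(line.len() + 8);
    let mut count = 0;
    for ch in line.chars() {
        if count == width {
            out.push('\n');
            count = 0;
        }
        out.push(ch);
        count += 1;
    }
    out.into_bytes()
}

/// Route pure-ASCII lines to the byte-based path and the rest to the
/// character-based one.
fn fold_line(line: &str, width: usize) -> Vec<u8> {
    assert!(width > 0, "width must be positive");
    if line.is_ascii() {
        fold_ascii(line.as_bytes(), width)
    } else {
        fold_utf8(line, width)
    }
}
```

Both paths produce the same bytes for ASCII input; the fast path simply skips per-character decoding.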
Add "rposition" to the cspell jargon dictionary to prevent spell check errors for this technical term.
@github-actions

github-actions bot commented Nov 4, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

- Replace fold/writeln! with loop/push_str in benchmarks for faster string building
- Add append_usize helper to avoid allocations in benchmark data generation
- Refactor emit_output to use drain instead of split_off for better performance
- Update last_space calculation to handle index adjustments more efficiently

These changes improve performance in the fold utility's benchmarks and core logic by reducing allocations and optimizing string operations.
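As a rough illustration of the `drain` versus `split_off` point, the two hypothetical helpers below emit the first `consume` bytes of a buffer and keep the remainder. `Vec::split_off` returns the tail in a newly allocated vector, whereas `Vec::drain` removes the front range and shifts the tail down within the existing allocation. The function names and the `sink` parameter are assumptions for this sketch.

```rust
/// Emit the first `consume` bytes as a line, keeping the rest in place.
/// The tail is shifted down inside the existing allocation.
fn emit_with_drain(output: &mut Vec<u8>, consume: usize, sink: &mut Vec<u8>) {
    sink.extend(output.drain(..consume));
    sink.push(b'\n');
}

/// Same result, but the remainder is moved into a freshly allocated Vec.
fn emit_with_split_off(output: &mut Vec<u8>, consume: usize, sink: &mut Vec<u8>) {
    let remainder = output.split_off(consume);
    sink.extend_from_slice(output);
    sink.push(b'\n');
    *output = remainder;
}
```

Both helpers panic if `consume` exceeds the buffer length, so a caller would need to guarantee that bound.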
…output

Break the inline closure into a multi-line block for better code clarity and maintainability.
The condition for updating the last space index was changed from `idx + 1 <= consume` to `idx < consume` to fix an off-by-one error, ensuring proper handling of spaces when consuming characters during line folding.
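A minimal sketch of how a remembered space index might be adjusted after `consume` bytes are removed from the front of the buffer. The function name `rebase_last_space` and the exact rule are assumptions; only the `idx < consume` comparison is taken from the commit message.

```rust
/// After `consume` bytes leave the front of the buffer, either forget the
/// remembered space (it was emitted with the line) or shift its index down.
/// ASSUMPTION: hypothetical helper, not the PR's actual code.
fn rebase_last_space(last_space: Option<usize>, consume: usize) -> Option<usize> {
    last_space.and_then(|idx| {
        if idx < consume {
            // The space was part of the consumed prefix.
            None
        } else {
            // The space is still buffered; its index moves left.
            Some(idx - consume)
        }
    })
}
```

For `usize` values, `idx + 1 <= consume` selects the same indices as `idx < consume` whenever `idx + 1` does not overflow; the `idx < consume` spelling avoids that addition.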
…andling

Refactor the `process_ascii_line` function to use a while loop with pattern matching instead of a for loop, improving efficiency and clarity. Introduce `push_ascii_segment` to handle contiguous printable character sequences, ensuring accurate column counting and whitespace tracking in both columns and characters modes. This addresses potential issues with control character processing and width calculations.
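The loop shape described here can be sketched as below: a `while` loop matches on the current byte, handles control bytes one at a time, and groups everything else into contiguous runs. The `Piece` enum, `split_ascii_line`, and the chosen set of special bytes are assumptions for illustration; in the PR each printable run would presumably be handed to something like `push_ascii_segment`.

```rust
/// Pieces a line is split into before column accounting.
#[derive(Debug, PartialEq)]
enum Piece<'a> {
    /// A contiguous run of bytes with no special handling.
    Printable(&'a [u8]),
    /// A single byte that changes the column in a special way.
    Control(u8),
}

/// ASSUMPTION: tab, backspace, carriage return, and newline are the
/// bytes treated specially in this sketch.
fn is_special(byte: u8) -> bool {
    matches!(byte, b'\t' | b'\x08' | b'\r' | b'\n')
}

fn split_ascii_line(line: &[u8]) -> Vec<Piece<'_>> {
    let mut pieces = Vec::new();
    let mut idx = 0;
    while idx < line.len() {
        match line[idx] {
            byte @ (b'\t' | b'\x08' | b'\r' | b'\n') => {
                pieces.push(Piece::Control(byte));
                idx += 1;
            }
            _ => {
                // Extend the run until the next special byte or end of line.
                // The current byte is not special, so the run is never empty.
                let len = line[idx..]
                    .iter()
                    .position(|&b| is_special(b))
                    .unwrap_or(line.len() - idx);
                pieces.push(Piece::Printable(&line[idx..idx + len]));
                idx += len;
            }
        }
    }
    pieces
}
```

Handling a whole run at once means the column counter can be advanced by the run length instead of once per byte.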
@mattsu2020
Contributor Author

> it significantly regressed the benchmark performances: https://codspeed.io/uutils/coreutils/branches/mattsu2020%3Afold_fix?uri=src%2Fuu%2Ffold%2Fbenches%2Ffold_bench.rs%3A%3Afold_many_lines%5B100000%5D&runnerMode=Instrumentation&sectionId=benchmark-comparison-section-comparison-failed
>
> could you please have a look? thanks

I tried to improve it, but this is the best I can do.

@github-actions

github-actions bot commented Nov 4, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

@sylvestre
Contributor

could you please move the bench changes into a different PR? thanks

@mattsu2020
Contributor Author

> could you please move the bench changes into a different PR? thanks

#9210

@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

- Simplify fold_file by sharing FoldContext construction
- Reduces duplicated code and improves maintainability without behavior change
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/fold/fold-characters is no longer failing!
Congrats! The gnu test tests/fold/fold-nbsp is no longer failing!

@mattsu2020 mattsu2020 requested a review from sylvestre November 13, 2025 10:36
@sylvestre sylvestre merged commit 4a48c9e into uutils:main Nov 13, 2025
127 checks passed
@mattsu2020 mattsu2020 deleted the fold_fix branch November 13, 2025 10:48