fix(fold): GNU fold-characters.sh test #9126
Add a new --characters flag to the fold utility, allowing it to count using Unicode character positions rather than display columns. This provides more accurate line breaking for text containing wide characters. Includes dependency on unicode-width crate and updated help messages in English and French locales.
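The difference between the two counting modes can be sketched as follows. This is a minimal illustration, not the PR's code: `WidthMode`, `fold_line`, and `approx_width` are hypothetical names, and `approx_width` is a crude stand-in for the width lookup that the `unicode-width` crate provides.

```rust
/// Hypothetical stand-in for a real width table (the PR depends on the
/// `unicode-width` crate). Treats a few East Asian ranges as two columns
/// and everything else as one; a real table is far more detailed.
fn approx_width(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F
        | 0x2E80..=0xA4CF
        | 0xAC00..=0xD7A3
        | 0xF900..=0xFAFF
        | 0xFF00..=0xFF60 => 2,
        _ => 1,
    }
}

/// Hypothetical mode switch: count display columns or count characters.
#[derive(Clone, Copy)]
enum WidthMode {
    Columns,
    Characters,
}

/// Split `line` into pieces that each fit within `width` units of `mode`.
fn fold_line(line: &str, width: usize, mode: WidthMode) -> Vec<String> {
    let mut pieces = Vec::new();
    let mut current = String::new();
    let mut used = 0;
    for c in line.chars() {
        let w = match mode {
            WidthMode::Columns => approx_width(c),
            WidthMode::Characters => 1,
        };
        // Break before a character that would overflow the current piece.
        if used + w > width && !current.is_empty() {
            pieces.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(c);
        used += w;
    }
    if !current.is_empty() {
        pieces.push(current);
    }
    pieces
}
```

With a width of 4, a run of double-width characters breaks after two characters in `Columns` mode and after four in `Characters` mode.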
…fy function signatures

Introduce a new FoldContext struct to group related fields (spaces, width, mode, writer, output, col_count, last_space) into a single context object. This refactoring reduces parameter passing in emit_output and process_utf8_line functions, improving code readability and maintainability without altering the core folding logic.
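A context struct of this kind might look roughly like the sketch below. The field names come from the commit message; the field types, the `WidthMode` enum, and the body of `emit_output` are assumptions made for illustration and may differ from the actual code.

```rust
use std::io::{self, Write};

/// Hypothetical mode enum; the commit message only names a `mode` field.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WidthMode {
    Columns,
    Characters,
}

/// Field names follow the commit message; the types are guesses.
#[allow(dead_code)]
struct FoldContext<W: Write> {
    spaces: bool,
    width: usize,
    mode: WidthMode,
    writer: W,
    output: Vec<u8>,
    col_count: usize,
    last_space: Option<usize>,
}

/// Assumed behavior: flush the pending segment plus a newline, then reset
/// the per-line counters. One `&mut FoldContext` replaces what would
/// otherwise be a long list of separate arguments.
fn emit_output<W: Write>(ctx: &mut FoldContext<W>) -> io::Result<()> {
    ctx.writer.write_all(&ctx.output)?;
    ctx.writer.write_all(b"\n")?;
    ctx.output.clear();
    ctx.col_count = 0;
    ctx.last_space = None;
    Ok(())
}
```

Passing one context keeps helper signatures short, which is also what silences Clippy's `too_many_arguments` lint.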
CodSpeed Performance Report: Merging #9126 will improve performance by 49.37%.
It significantly regressed benchmark performance. Could you please have a look? Thanks.
Add process_ascii_line function to handle ASCII bytes efficiently, avoiding UTF-8 overhead for ASCII input. Update emit_output to properly manage output buffer remainder and track last space position for better folding logic. Modify process_utf8_line to delegate ASCII lines to the new function.
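The delegation described here can be sketched as below, under simplifying assumptions: the line carries no trailing newline, control characters are ignored, and only character counting is shown. The function names match the commit message, but the signatures and bodies are illustrative guesses.

```rust
/// ASCII fast path: every byte is one character and one column, so the
/// line can be cut into fixed-size byte chunks without decoding UTF-8.
fn process_ascii_line(line: &[u8], width: usize, out: &mut Vec<u8>) {
    // `chunks` panics on a size of 0, so clamp the width (sketch-only guard).
    for (i, chunk) in line.chunks(width.max(1)).enumerate() {
        if i > 0 {
            out.push(b'\n');
        }
        out.extend_from_slice(chunk);
    }
}

/// General path: delegate pure-ASCII lines, otherwise walk `char`s.
fn process_utf8_line(line: &str, width: usize, out: &mut Vec<u8>) {
    if line.is_ascii() {
        return process_ascii_line(line.as_bytes(), width, out);
    }
    let width = width.max(1);
    let mut count = 0;
    let mut buf = [0u8; 4];
    for c in line.chars() {
        if count == width {
            out.push(b'\n');
            count = 0;
        }
        out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        count += 1;
    }
}
```

Both paths produce the same breaks for ASCII input; the fast path simply skips per-character decoding.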
Add "rposition" to the cspell jargon dictionary to prevent spell check errors for this technical term.
- Replace fold/writeln! with loop/push_str in benchmarks for faster string building
- Add append_usize helper to avoid allocations in benchmark data generation
- Refactor emit_output to use drain instead of split_off for better performance
- Update last_space calculation to handle index adjustments more efficiently

These changes improve performance in the fold utility's benchmarks and core logic by reducing allocations and optimizing string operations.
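The `split_off` versus `drain` change can be illustrated with the standard `Vec` API. The helper names below are hypothetical, and the handling of `last_space` is an assumption about how an index would be adjusted after the front of the buffer is removed.

```rust
use std::io::{self, Write};

/// Older shape: `split_off` moves the remainder into a freshly allocated Vec.
fn emit_prefix_split_off<W: Write>(
    buf: &mut Vec<u8>,
    consume: usize,
    w: &mut W,
) -> io::Result<()> {
    let tail = buf.split_off(consume);
    w.write_all(&buf[..])?;
    *buf = tail;
    Ok(())
}

/// Newer shape: write the prefix, then `drain` it so the remainder is
/// shifted down inside the existing allocation.
fn emit_prefix_drain<W: Write>(
    buf: &mut Vec<u8>,
    consume: usize,
    w: &mut W,
) -> io::Result<()> {
    w.write_all(&buf[..consume])?;
    buf.drain(..consume);
    Ok(())
}

/// Assumed index adjustment: a remembered space that fell inside the
/// consumed prefix is forgotten; otherwise its index moves left by `consume`.
fn shift_last_space(last_space: Option<usize>, consume: usize) -> Option<usize> {
    last_space.and_then(|idx| idx.checked_sub(consume))
}
```

Both helpers leave the same bytes in `buf` and write the same prefix; the second reuses the buffer instead of allocating a new one for the tail.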
…output

Break the inline closure into a multi-line block for better code clarity and maintainability.
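The condition for updating the last space index was changed from `idx + 1 <= consume` to `idx < consume`. For integer indices the two comparisons are equivalent (barring overflow of `idx + 1`), so this simplifies the check into the form Clippy's `int_plus_one` lint recommends rather than changing which spaces are kept when characters are consumed during line folding.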
…andling

Refactor the `process_ascii_line` function to use a while loop with pattern matching instead of a for loop, improving efficiency and clarity. Introduce `push_ascii_segment` to handle contiguous printable character sequences, ensuring accurate column counting and whitespace tracking in both columns and characters modes. This addresses potential issues with control character processing and width calculations.
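The loop shape described here might look like the sketch below. It covers column mode only, handles just carriage return and backspace as control bytes (tabs and whitespace tracking are left out), and the signatures are assumptions rather than the PR's actual ones.

```rust
/// Append a run of printable bytes, breaking whenever the column limit is hit.
fn push_ascii_segment(mut seg: &[u8], width: usize, out: &mut Vec<u8>, col: &mut usize) {
    let width = width.max(1); // sketch-only guard against a zero width
    while !seg.is_empty() {
        if *col >= width {
            out.push(b'\n');
            *col = 0;
        }
        let take = (width - *col).min(seg.len());
        out.extend_from_slice(&seg[..take]);
        *col += take;
        seg = &seg[take..];
    }
}

/// Walk the line with an explicit index: control bytes are handled one at
/// a time, while printable runs are handed off in bulk.
fn process_ascii_line(line: &[u8], width: usize, out: &mut Vec<u8>, col: &mut usize) {
    let mut i = 0;
    while i < line.len() {
        match line[i] {
            b'\r' => {
                // Carriage return moves back to column 0.
                out.push(b'\r');
                *col = 0;
                i += 1;
            }
            0x08 => {
                // Backspace moves one column left, never below 0.
                out.push(0x08);
                *col = col.saturating_sub(1);
                i += 1;
            }
            _ => {
                // Find the end of the contiguous run of non-control bytes.
                let end = line[i..]
                    .iter()
                    .position(|&b| b == b'\r' || b == 0x08)
                    .map_or(line.len(), |p| i + p);
                push_ascii_segment(&line[i..end], width, out, col);
                i = end;
            }
        }
    }
}
```

A `while` loop with a manual index lets the printable branch skip ahead by a whole run, which a byte-by-byte `for` loop cannot do directly.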
I tried to improve it, but this is the best I can do.
Could you please move the bench changes into a different PR? Thanks.
- Simplify fold_file by sharing FoldContext construction
- Reduces duplicated code and improves maintainability without behavior change
fix GNU fold-characters.sh test
Added a new FoldContext helper to keep width mode, whitespace handling, buffers, and counters together, letting process_utf8_line and process_non_utf8_line focus solely on folding logic.
Refactored folding helpers to use the shared context and route through the unified emit_output, resolving the Clippy too_many_arguments errors without altering behaviour.
#9127