@naoNao89
Contributor

@naoNao89 naoNao89 commented Nov 1, 2025

Fixes #6078

page-aligned buffers + smarter O_DIRECT handling. Theory says 5x fewer syscalls. 🗿

Checklist:

  • Add dd's first benchmark suite to verify the change
  • Merge the benchmarks into this branch to check the performance report

@naoNao89 naoNao89 force-pushed the feature/o-direct-buffer-alignment branch 3 times, most recently from 9f131bd to fecebff Compare November 1, 2025 06:05
@codspeed-hq

codspeed-hq bot commented Nov 1, 2025

CodSpeed Performance Report

Merging #9104 will not alter performance

Comparing naoNao89:feature/o-direct-buffer-alignment (a713108) with main (1eea517)

Summary

✅ 136 untouched
⏩ 24 skipped [1]

Footnotes

  1. 24 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Implement page-aligned buffer allocation and optimize O_DIRECT flag
handling to match GNU dd behavior.

Key changes:
- Add allocate_aligned_buffer() for page-aligned memory allocation
- Update buffer allocation to use aligned buffers
- Modify handle_o_direct_write() to only remove O_DIRECT for partial blocks
- Add Output::write_with_o_direct_handling() for proper O_DIRECT handling
- Add comprehensive unit and integration tests
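The page-aligned allocation in the first two items can be illustrated with a minimal sketch. This is not the PR's `allocate_aligned_buffer()`; it is a safe alternative that over-allocates a `Vec<u8>` and exposes an aligned window. The names `AlignedBuf`, `page_aligned`, and `ASSUMED_PAGE_SIZE` are hypothetical, and the 4096-byte page size is an assumption (real code would query the OS). Unlike the PR's later revision, this sketch zero-initializes the storage.

```rust
/// Assumed page size for this sketch; real code would query the OS
/// (for example via `sysconf(_SC_PAGESIZE)`).
pub const ASSUMED_PAGE_SIZE: usize = 4096;

/// Hypothetical owner of an over-allocated `Vec<u8>` that exposes an
/// aligned window of `len` bytes.
pub struct AlignedBuf {
    storage: Vec<u8>,
    offset: usize,
    len: usize,
}

impl AlignedBuf {
    /// Allocates `len` usable bytes whose start address is a multiple of `align`.
    pub fn new(len: usize, align: usize) -> Self {
        assert!(align > 0, "alignment must be non-zero");
        // Over-allocate by `align` bytes so an aligned start always fits.
        let storage = vec![0u8; len + align];
        let addr = storage.as_ptr() as usize;
        // Distance from the start of the allocation to the next aligned address.
        let offset = (align - addr % align) % align;
        Self { storage, offset, len }
    }

    /// Convenience constructor using the assumed page size.
    pub fn page_aligned(len: usize) -> Self {
        Self::new(len, ASSUMED_PAGE_SIZE)
    }

    /// The aligned, read-only view of the buffer.
    pub fn as_slice(&self) -> &[u8] {
        &self.storage[self.offset..self.offset + self.len]
    }

    /// The aligned, writable view of the buffer.
    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.storage[self.offset..self.offset + self.len]
    }
}
```

The over-allocation costs up to one extra alignment unit per buffer, but it avoids `unsafe` and keeps deallocation in the hands of `Vec`.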

Fixes uutils#6078
@naoNao89 naoNao89 force-pushed the feature/o-direct-buffer-alignment branch from fecebff to 2560240 Compare November 1, 2025 06:13
…IRECT on ARM

O_DIRECT requires page-aligned buffers and writes. The conv=sync flag pads
output to block size, which may not be page-aligned, causing EINVAL errors
on ARM systems. The core O_DIRECT functionality is already well-tested by:
- test_o_direct_with_aligned_buffer_full_blocks
- test_o_direct_with_partial_final_block
- test_o_direct_various_block_sizes
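To make the alignment constraint above concrete, here is a small sketch of deciding which writes could keep O_DIRECT. It is pure planning logic with no syscalls; `plan_writes` is a hypothetical helper, not a function from the PR, and the alignment value passed in is an assumption, since the actual requirement depends on the filesystem and device.

```rust
/// Hypothetical helper: split `total` bytes into writes of at most
/// `block_size` bytes and mark which ones satisfy the assumed O_DIRECT
/// length constraint (a non-zero multiple of `align`).
///
/// Returns `(write_len, may_use_o_direct)` pairs in write order.
pub fn plan_writes(total: usize, block_size: usize, align: usize) -> Vec<(usize, bool)> {
    assert!(block_size > 0, "block size must be non-zero");
    assert!(align > 0, "alignment must be non-zero");

    let mut plan = Vec::new();
    let mut remaining = total;
    while remaining > 0 {
        // Every write is a full block except possibly the last one.
        let len = remaining.min(block_size);
        // A write whose length is not a multiple of `align` would need
        // O_DIRECT cleared first to avoid EINVAL.
        plan.push((len, len % align == 0));
        remaining -= len;
    }
    plan
}
```

With a 4096-byte block and 4096-byte alignment, 10240 bytes yield two direct writes and one 2048-byte tail that would need O_DIRECT removed. With a 1000-byte block, as `conv=sync` padding could produce, no write qualifies.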
@github-actions

github-actions bot commented Nov 1, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

@naoNao89 I do value your contribution and terrific work. But please also focus on finishing your existing work instead of starting new work:

#9088 => red CI
#9094 => red CI
#9026 => WIP
#8949 => still draft
etc :)

@naoNao89
Contributor Author

naoNao89 commented Nov 1, 2025

I need more dopamine when stuck on a bug, so new PRs might be good :))

@cakebaker cakebaker changed the title optimize O_DIRECT buffer alignment to reduce syscall overhead dd: optimize O_DIRECT buffer alignment to reduce syscall overhead Nov 1, 2025
@github-actions

github-actions bot commented Nov 2, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 closed this by deleting the head repository Nov 6, 2025
@naoNao89 naoNao89 reopened this Nov 7, 2025
@github-actions

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

Contributor

@cre4ture cre4ture left a comment

thanks

/// This function allocates a `Vec<u8>` with proper alignment to support O_DIRECT
/// without triggering EINVAL errors.
#[cfg(any(target_os = "linux", target_os = "android"))]
fn allocate_aligned_buffer(size: usize) -> Vec<u8> {
Contributor

@sylvestre is this something we could move to a more central location? Or is this the only place where we need aligned memory allocations?

Contributor

Which programs will use it? Thanks

sylvestre and others added 2 commits December 29, 2025 14:28
- Remove dead code: non-Linux stub for handle_o_direct_write
  The stub was unreachable since write_with_o_direct_handling already
  has a non-Linux stub that doesn't call this helper function.

- Fix clippy::ptr-as-ptr lint error
  Replace the `as *mut u8` pointer cast with the `.cast::<u8>()` method,
  which changes only the pointee type, in allocate_aligned_buffer.

Addresses review comments and CI/CD failures in PR uutils#9104.
@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Removed redundant buffer initialization in allocate_aligned_buffer that
was causing performance regression, especially for large block sizes.

- Eliminated O(n) write_bytes overhead that scaled with buffer size
- Fixes 29.36% regression for 1M blocks and 6.22% for 64K blocks
- Buffer is correctly filled during copy operations, making pre-init redundant
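The reasoning in this commit message, that a buffer filled by the read path does not need to be zeroed first, can be sketched in safe Rust. This is an illustration only, not the PR's code: `read_block` is a hypothetical helper, and it ignores the alignment that O_DIRECT would require.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read up to `block_size` bytes into a fresh buffer
/// without zero-filling it first.
pub fn read_block<R: Read>(reader: &mut R, block_size: usize) -> io::Result<Vec<u8>> {
    // Reserve capacity only; no O(n) initialization pass over the buffer.
    let mut buf = Vec::with_capacity(block_size);
    // `take` caps the read at one block; `read_to_end` appends exactly the
    // bytes that were read, so the buffer never exposes uninitialized data.
    reader.by_ref().take(block_size as u64).read_to_end(&mut buf)?;
    Ok(buf)
}
```

The returned vector's length equals the number of bytes actually read, so a short final block needs no separate bookkeeping.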

Development

Successfully merging this pull request may close these issues.

dd fails on gnu test tests/dd/direct.sh - dynamic removal of O_DIRECT missing

3 participants