Add support for filtering orphaned files with --owner orphan #1841
Extends the --owner filter to support finding files with no valid user/group by using the 'orphan' keyword. This provides equivalent functionality to find's -nouser and -nogroup flags.

Examples:
  fd --owner orphan         # equivalent to find -nouser
  fd --owner :orphan        # equivalent to find -nogroup
  fd --owner orphan:orphan  # both -nouser and -nogroup

Implementation:
- Added Check::Orphaned variant to owner filter
- Extended OwnerFilter::from_string to recognize 'orphan' keyword
- Updated OwnerFilter::matches to check if uid/gid maps to valid user/group
- Added unit tests for orphan parsing
src/filter/owner.rs
Outdated
|
|
||
| self.uid.check(md.uid()) && self.gid.check(md.gid()) | ||
| let uid_match = match self.uid { | ||
| Check::Orphaned => User::from_uid(md.uid().into()).ok().flatten().is_none(), |
It might be worth using a cache, so we don't have to do a lookup for every single file. We could probably even use a Set of uids known not to be orphaned.
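A minimal sketch of the caching idea in this comment, assuming a set of uids already known to be valid. The `KnownUids` type and its `lookup` closure are hypothetical stand-ins for the real user-database query; they are not part of the PR.

```rust
use std::collections::HashSet;

/// Remembers uids that are known to resolve to a real user, so the
/// (potentially slow) lookup runs at most once per valid uid.
struct KnownUids<F: Fn(u32) -> bool> {
    valid: HashSet<u32>,
    /// Hypothetical stand-in for the real user-database query.
    lookup: F,
}

impl<F: Fn(u32) -> bool> KnownUids<F> {
    fn new(lookup: F) -> Self {
        Self {
            valid: HashSet::new(),
            lookup,
        }
    }

    /// Returns true when `uid` has no matching user.
    fn is_orphaned(&mut self, uid: u32) -> bool {
        if self.valid.contains(&uid) {
            return false; // cache hit: this uid is known to exist
        }
        if (self.lookup)(uid) {
            self.valid.insert(uid);
            false
        } else {
            // Orphaned uids are not cached here, so they are looked up
            // again on the next file that carries them.
            true
        }
    }
}
```

Because only positive results are stored, a tree full of files owned by the same orphaned uid would still trigger one lookup per file.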
src/filter/owner.rs
Outdated
| Check::Equal(x) => v == *x, | ||
| Check::NotEq(x) => v != *x, | ||
| Check::Ignore => true, | ||
| Check::Orphaned => unreachable!("Orphaned check handled in OwnerFilter::matches"), |
I don't really like how the caller is responsible for handling this separately. It would be nice if we could just include a Set of allowed ids in the Orphaned variant, but I think on some systems, the user ids could be dynamic, and reading from /etc/passwd isn't sufficient.
Another option could be to have the Orphaned variant include a function to check if an id exists, or even just a value indicating if it is for user ids or group ids.
src/filter/owner.rs
Outdated
| both_negate:"!4:!3" => Ok(OwnerFilter { uid: NotEq(4), gid: NotEq(3) }), | ||
| uid_not_gid:"6:!8" => Ok(OwnerFilter { uid: Equal(6), gid: NotEq(8) }), | ||
|
|
||
| orphan_uid: "orphan" => Ok(OwnerFilter { uid: Orphaned, gid: Ignore }), |
It is possible that a user or group is actually named "orphan". I think it would probably be better to use a different option, or maybe use something that isn't a valid username like "-".
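A rough sketch of how a "-" sentinel could be parsed, assuming numeric ids only. The `parse_side` and `parse_owner` helpers are hypothetical; the actual `OwnerFilter::from_string` also resolves user and group names, which is left out here.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Check {
    Equal(u32),
    NotEq(u32),
    Ignore,
    Orphaned,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct OwnerFilter {
    uid: Check,
    gid: Check,
}

/// Parses one side of `uid:gid`. Only numeric ids are handled in this
/// sketch; name resolution is intentionally omitted.
fn parse_side(s: &str) -> Result<Check, String> {
    match s {
        "" => Ok(Check::Ignore),
        "-" => Ok(Check::Orphaned), // sentinel suggested in the review
        _ => {
            let (negate, rest) = match s.strip_prefix('!') {
                Some(rest) => (true, rest),
                None => (false, s),
            };
            let id: u32 = rest
                .parse()
                .map_err(|_| format!("'{rest}' is not a valid id"))?;
            Ok(if negate { Check::NotEq(id) } else { Check::Equal(id) })
        }
    }
}

/// Splits on the first ':' and parses each side independently.
fn parse_owner(input: &str) -> Result<OwnerFilter, String> {
    let (uid, gid) = match input.split_once(':') {
        Some((u, g)) => (u, g),
        None => (input, ""),
    };
    Ok(OwnerFilter {
        uid: parse_side(uid)?,
        gid: parse_side(gid)?,
    })
}
```

With this shape, the word "orphan" is no longer special: in this numeric-only sketch it is rejected, and in a name-aware parser it would presumably be treated as an ordinary user name.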
src/cli.rs
Outdated
| /// Filter files by their user and/or group. | ||
| /// Format: [(user|uid)][:(group|gid)]. Either side is optional. | ||
| /// Precede either side with a '!' to exclude files instead. | ||
| /// Use 'orphan' to match files with no valid user/group. |
If we move forward with this, you'll also need to update the man page and CHANGELOG.
Changes based on maintainer feedback:
- Changed keyword from 'orphan' to '-' to avoid conflicts with actual usernames
- Refactored Check::Orphan to include validator function (fn(u32) -> bool)
- Simplified matches() to use check() uniformly - no special case handling
- Added caching with LazyLock<Mutex<HashSet>> to avoid repeated uid/gid lookups
- Updated documentation (man page and CHANGELOG)

The new design stores the validation logic in the Orphan variant itself, making it self-contained and consistent with other Check variants. All checks now go through the generic check() method.
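The design described in this commit message could look roughly like the sketch below. The variant name and the `fn(u32) -> bool` validator type come from the message; the `user_exists` and `group_exists` functions are hypothetical placeholders with hard-coded ids, standing in for a real user/group database lookup.

```rust
#[derive(Clone, Copy, Debug)]
enum Check {
    Equal(u32),
    NotEq(u32),
    Ignore,
    /// Carries a validator that returns true when the id exists.
    Orphan(fn(u32) -> bool),
}

impl Check {
    /// Every variant is handled here, so callers need no special case.
    fn check(&self, v: u32) -> bool {
        match self {
            Check::Equal(x) => v == *x,
            Check::NotEq(x) => v != *x,
            Check::Ignore => true,
            Check::Orphan(exists) => !exists(v),
        }
    }
}

struct OwnerFilter {
    uid: Check,
    gid: Check,
}

impl OwnerFilter {
    fn matches(&self, uid: u32, gid: u32) -> bool {
        self.uid.check(uid) && self.gid.check(gid)
    }
}

// Hypothetical validators: a real implementation would query the
// system user and group databases instead of hard-coding ids.
fn user_exists(uid: u32) -> bool {
    matches!(uid, 0 | 1000)
}

fn group_exists(gid: u32) -> bool {
    matches!(gid, 0 | 100)
}
```

A filter for `-:-` would then be built as `OwnerFilter { uid: Check::Orphan(user_exists), gid: Check::Orphan(group_exists) }`, and `matches` stays a plain conjunction of the two checks.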
|
Addressed all feedback: changed keyword to |
|
bump @tmccombs |
|
@tmccombs I should have fixed it |
|
bump @tmccombs is there sth I should fix? |
src/filter/owner.rs
Outdated
| use std::sync::{LazyLock, Mutex}; | ||
|
|
||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| static VALID_UIDS: LazyLock<Mutex<HashSet<u32>>> = LazyLock::new(|| Mutex::new(HashSet::new())); |
This will not be performant. I'd use a thread-local instead. You should also use something like a HashMap<u32, bool> so that you can cache both negative and positive results.
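One way to read this suggestion is a per-thread map from uid to lookup result, sketched below. The `UID_EXISTS` static and the `cached_user_exists` helper are hypothetical names, and the `lookup` parameter stands in for whatever the real user-database query is.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    // One cache per thread: no lock is needed, at the cost of each
    // worker thread repeating a lookup the first time it sees a uid.
    static UID_EXISTS: RefCell<HashMap<u32, bool>> = RefCell::new(HashMap::new());
}

/// Returns whether `uid` maps to a user, consulting the per-thread cache
/// first. Both positive and negative results are stored.
/// `lookup` must not call back into this function, since the cache is
/// mutably borrowed while it runs.
fn cached_user_exists(uid: u32, lookup: impl FnOnce(u32) -> bool) -> bool {
    UID_EXISTS.with(|cache| {
        *cache
            .borrow_mut()
            .entry(uid)
            .or_insert_with(|| lookup(uid))
    })
}
```

An orphan check is then `!cached_user_exists(uid, lookup)`, and a directory full of files owned by the same missing uid costs a single lookup per thread.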
tmccombs
left a comment
Mostly looks good. Just address @tavianator's comment.
- Replace LazyLock<Mutex<HashSet>> with thread_local RefCell<HashMap>
- Cache both positive and negative lookup results
- Avoids mutex contention in parallel traversal
|
@tavianator I should have addressed ur review in the last commit |
@tmccombs |
|
@tavianator bump. ty in advance |