Redfs-6.8 with dlm_lock opcode for fuse #5
9b1fb53 to ce471a0
@hbirth Could you redirect the PR to redfs-ubuntu-noble-6.8.0-58.60? I would like to use that as the new base branch. I first wanted to fix the remaining bug in that branch, but that bug is also present in redfs-6.8, so it makes no difference. The redfs-ubuntu-noble-6.8.0-58.60 branch has the io-uring code that went upstream...
9157f40 to 512a1c8
done!
I'm missing punch_hole, for example in fuse_reverse_inval_inode() and fuse_direct_io(). As well as fuse_do_setattr() on truncate.
Well, punch hole in fuse_direct_io() is debatable, as we still write; just the pages are given up.
Note that fuse_reverse_inval_inode() takes a spin lock; invalidating the dlm cache range before or after taking the lock is a bit racy.
There was some temporary code in there that I put in for testing ... will be fixed
5719216 to 6174269
@bsbernd tested this with truncate and append ... I have not tested the punch hole in 'notify' fuse_reverse_inval_inode()
Looks good, just some final nitpicks. I think we should pass the fuse inode everywhere instead of fuse_dlm_cache. Or rename fuse_dlm_lock to fuse_inode_dlm lock?
I think those are good ideas ... will do that (pass the inode), it makes it easier to read, I hope
68e1891 to 25060e9
When the writeback cache is enabled, it is beneficial for data consistency to tell the FUSE server when the kernel prepares a page for caching. This lets the FUSE server react and lock the page. Additionally, the same call lets the FUSE server decide how much data it locks, and the kernel keeps that information in the dlm lock management. If the feature is not supported, it is disabled after the first unsuccessful use.

- Add DLM_LOCK fuse opcode
- Add cache page lock caching for writeback cache functionality. This means sending out a FUSE call whenever the kernel prepares a page for writeback cache. The kernel manages the cache so that it keeps track of already acquired locks (except for the case that is documented in the code).
- Use rb-trees for the management of the already 'locked' page ranges
- Use rw_semaphore for synchronization in fuse_dlm_cache
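The bookkeeping described in this commit message can be illustrated with a small userspace sketch. It is not the patch's code: every name below (`range_cache`, `cache_add`, `cache_punch`, `lock_page`, `fake_server_grant`) is hypothetical, a sorted linked list stands in for the rb-tree, the rw_semaphore is left out, and the "server" is simulated by a function that grants the 16-page aligned chunk around the requested page. It only shows the idea: ask the server once per uncovered page, remember the range it granted, and drop ranges again on truncate or punch hole.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* All names are hypothetical; none are taken from the patch. */
struct range {
    uint64_t start, end;          /* inclusive page indices */
    struct range *next;
};

/* Sorted list of disjoint, non-adjacent ranges (stand-in for the rb-tree). */
struct range_cache {
    struct range *head;
};

static unsigned server_calls;     /* counts simulated DLM_LOCK requests */

/* Simulated server reply: grants the 16-page aligned chunk around `page`. */
static void fake_server_grant(uint64_t page, uint64_t *start, uint64_t *end)
{
    server_calls++;
    *start = page & ~(uint64_t)15;
    *end = *start + 15;
}

/* True if [start, end] lies inside one cached range. */
static bool cache_covers(const struct range_cache *c, uint64_t start, uint64_t end)
{
    for (const struct range *r = c->head; r; r = r->next)
        if (r->start <= start && end <= r->end)
            return true;
    return false;
}

/* Record [start, end] as locked, merging overlapping or touching ranges. */
static int cache_add(struct range_cache *c, uint64_t start, uint64_t end)
{
    struct range **pp = &c->head;
    struct range *n = malloc(sizeof(*n));

    assert(start <= end);
    if (!n)
        return -1;
    /* Skip ranges that end before `start` and do not touch it. */
    while (*pp && (*pp)->end < start && start - (*pp)->end > 1)
        pp = &(*pp)->next;
    /* Absorb every range that overlaps or touches [start, end]. */
    while (*pp && ((*pp)->start <= end || (*pp)->start - end == 1)) {
        struct range *r = *pp;
        if (r->start < start) start = r->start;
        if (r->end > end) end = r->end;
        *pp = r->next;
        free(r);
    }
    n->start = start; n->end = end; n->next = *pp;
    *pp = n;
    return 0;
}

/* Forget [start, end], e.g. on truncate or punch hole. */
static int cache_punch(struct range_cache *c, uint64_t start, uint64_t end)
{
    struct range **pp = &c->head;

    while (*pp) {
        struct range *r = *pp;
        if (r->end < start) { pp = &r->next; continue; }
        if (r->start > end) break;
        if (r->start < start && r->end > end) {   /* hole inside: split */
            struct range *tail = malloc(sizeof(*tail));
            if (!tail) return -1;
            tail->start = end + 1; tail->end = r->end; tail->next = r->next;
            r->end = start - 1; r->next = tail;
            return 0;
        }
        if (r->start < start) { r->end = start - 1; pp = &r->next; }
        else if (r->end > end) { r->start = end + 1; break; }
        else { *pp = r->next; free(r); }          /* fully inside the hole */
    }
    return 0;
}

/* Called when a page is prepared for the cache: ask only if not covered. */
static int lock_page(struct range_cache *c, uint64_t page)
{
    uint64_t start, end;

    if (cache_covers(c, page, page))
        return 0;                 /* already locked, no request needed */
    fake_server_grant(page, &start, &end);
    return cache_add(c, start, end);
}
```

In this sketch, calling `lock_page()` for pages 3 and then 10 issues one simulated request, because the first grant already covers both pages. After `cache_punch()` over a range, the next `lock_page()` inside that range asks the server again.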
Added DLM_LOCK opcode to fuse for signaling to fuse server that the kernel has mapped a page of a file in cache (only active when writeback cache is enabled)