-
Notifications
You must be signed in to change notification settings - Fork 57.9k
audit: fixed typo in comment #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
+1
−1
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
scosu
pushed a commit
to scosu/linux-sched
that referenced
this pull request
Jun 9, 2013
Lockdep found an inconsistent lock state when rcu is processing delayed work in softirq. Currently, kernel is using spin_lock/spin_unlock to protect proc_inum_ida, but proc_free_inum is called by rcu in softirq context. Use spin_lock_bh/spin_unlock_bh fix following lockdep warning. ================================= [ INFO: inconsistent lock state ] 3.7.0 torvalds#36 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (proc_inum_lock){+.?...}, at: proc_free_inum+0x1c/0x50 {SOFTIRQ-ON-W} state was registered at: __lock_acquire+0x8ae/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_alloc_inum+0x4c/0xd0 alloc_mnt_ns+0x49/0xc0 create_mnt_ns+0x25/0x70 mnt_init+0x161/0x1c7 vfs_caches_init+0x107/0x11a start_kernel+0x348/0x38c x86_64_start_reservations+0x131/0x136 x86_64_start_kernel+0x103/0x112 irq event stamp: 2993422 hardirqs last enabled at (2993422): _raw_spin_unlock_irqrestore+0x55/0x80 hardirqs last disabled at (2993421): _raw_spin_lock_irqsave+0x29/0x70 softirqs last enabled at (2993394): _local_bh_enable+0x13/0x20 softirqs last disabled at (2993395): call_softirq+0x1c/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(proc_inum_lock); <Interrupt> lock(proc_inum_lock); *** DEADLOCK *** no locks held by swapper/1/0. stack backtrace: Pid: 0, comm: swapper/1 Not tainted 3.7.0 torvalds#36 Call Trace: <IRQ> [<ffffffff810a40f1>] ? vprintk_emit+0x471/0x510 print_usage_bug+0x2a5/0x2c0 mark_lock+0x33b/0x5e0 __lock_acquire+0x813/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_free_inum+0x1c/0x50 free_pid_ns+0x1c/0x50 put_pid_ns+0x2e/0x50 put_pid+0x4a/0x60 delayed_put_pid+0x12/0x20 rcu_process_callbacks+0x462/0x790 __do_softirq+0x1b4/0x3b0 call_softirq+0x1c/0x30 do_softirq+0x59/0xd0 irq_exit+0x54/0xd0 smp_apic_timer_interrupt+0x95/0xa3 apic_timer_interrupt+0x72/0x80 cpuidle_enter_tk+0x10/0x20 cpuidle_enter_state+0x17/0x50 cpuidle_idle_call+0x287/0x520 cpu_idle+0xba/0x130 start_secondary+0x2b3/0x2bc Signed-off-by: Xiaotian Feng <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
hardkernel
referenced
this pull request
in hardkernel/linux
Jun 23, 2013
commit 4523e14 upstream. hugetlb_reserve_pages() can be used for either normal file-backed hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous mode, there is not a VMA around. The new call to resv_map_put() assumed that there was, and resulted in a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: vma_resv_map+0x9/0x30 PGD 141453067 PUD 1421e1067 PMD 0 Oops: 0000 [#1] PREEMPT SMP ... Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ #36 RIP: vma_resv_map+0x9/0x30 ... Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0) Call Trace: resv_map_put+0xe/0x40 hugetlb_reserve_pages+0xa6/0x1d0 hugetlb_file_setup+0x102/0x2c0 newseg+0x115/0x360 ipcget+0x1ce/0x310 sys_shmget+0x5a/0x60 system_call_fastpath+0x16/0x1b This was reported by Dave Jones, but was reproducible with the libhugetlbfs test cases, so shame on me for not running them in the first place. With this, the oops is gone, and the output of libhugetlbfs's run_tests.py is identical to plain 3.4 again. [ Marked for stable, since this was introduced by commit c50ac05 ("hugetlb: fix resv_map leak in error path") which was also marked for stable ] Reported-by: Dave Jones <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
torvalds
pushed a commit
that referenced
this pull request
Jul 10, 2013
Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ #36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]>
tositrino
pushed a commit
to tositrino/linux
that referenced
this pull request
Jul 29, 2013
[ Upstream commit 4ccb93c ] Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ torvalds#36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
heftig
referenced
this pull request
in zen-kernel/zen-kernel
Jul 29, 2013
[ Upstream commit 4ccb93c ] Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ #36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
baerwolf
pushed a commit
to baerwolf/linux-stephan
that referenced
this pull request
Aug 3, 2013
[ Upstream commit 4ccb93c ] Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ torvalds#36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
mdrjr
referenced
this pull request
in hardkernel/linux
Aug 6, 2013
[ Upstream commit 4ccb93c ] Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ #36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Oct 14, 2013
As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.644008] smpboot: Booting Node 1, Processors: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK [ 1.245006] smpboot: Booting Node 2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK [ 1.864005] smpboot: Booting Node 3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK [ 2.489005] smpboot: Booting Node 4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK [ 3.093005] smpboot: Booting Node 5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK [ 3.698005] smpboot: Booting Node 6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK [ 4.304005] smpboot: Booting Node 7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <[email protected]> Cc: Libin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Oct 14, 2013
Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 [ 0.603005] .... node #1, CPUs: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 [ 1.200005] .... node #2, CPUs: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.796005] .... node #3, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 [ 2.393005] .... node #4, CPUs: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 [ 2.996005] .... node #5, CPUs: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 3.600005] .... node torvalds#6, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 [ 4.202005] .... node torvalds#7, CPUs: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 [ 4.811005] .... node torvalds#8, CPUs: torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 5.421006] .... node torvalds#9, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 [ 6.032005] .... node torvalds#10, CPUs: torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 [ 6.648006] .... node torvalds#11, CPUs: torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 7.262005] .... node torvalds#12, CPUs: torvalds#96 torvalds#97 torvalds#98 torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103 [ 7.865005] .... node torvalds#13, CPUs: torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111 [ 8.466005] .... node torvalds#14, CPUs: torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119 [ 9.073006] .... node torvalds#15, CPUs: torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
kees
pushed a commit
to kees/linux
that referenced
this pull request
Nov 8, 2013
BugLink: http://bugs.launchpad.net/bugs/1214984 [ Upstream commit 4ccb93c ] Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ torvalds#36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Brad Figg <[email protected]>
shr-buildhost
pushed a commit
to shr-distribution/linux
that referenced
this pull request
Nov 30, 2013
commit 4523e14 upstream. hugetlb_reserve_pages() can be used for either normal file-backed hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous mode, there is not a VMA around. The new call to resv_map_put() assumed that there was, and resulted in a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: vma_resv_map+0x9/0x30 PGD 141453067 PUD 1421e1067 PMD 0 Oops: 0000 [#1] PREEMPT SMP ... Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ torvalds#36 RIP: vma_resv_map+0x9/0x30 ... Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0) Call Trace: resv_map_put+0xe/0x40 hugetlb_reserve_pages+0xa6/0x1d0 hugetlb_file_setup+0x102/0x2c0 newseg+0x115/0x360 ipcget+0x1ce/0x310 sys_shmget+0x5a/0x60 system_call_fastpath+0x16/0x1b This was reported by Dave Jones, but was reproducible with the libhugetlbfs test cases, so shame on me for not running them in the first place. With this, the oops is gone, and the output of libhugetlbfs's run_tests.py is identical to plain 3.4 again. [ Marked for stable, since this was introduced by commit c50ac05 ("hugetlb: fix resv_map leak in error path") which was also marked for stable ] Reported-by: Dave Jones <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Jan 14, 2014
Due to unknown hw issue so far, Merrifield is unable to enable HS200 support. This patch adds quirk to avoid SDHCI to initialize with error below: [ 53.850132] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.12.0-rc6-00037-g3d7c8d9-dirty torvalds#36 [ 53.850150] Hardware name: Intel Corporation Merrifield/SALT BAY, BIOS 397 2013.09.12:11.51.40 [ 53.850167] 00000000 00000000 ee409e48 c18816d2 00000000 ee409e78 c123e254 c1acc9b0 [ 53.850227] 00000000 00000000 c1b14148 000003de c16c03bf c16c03bf ee75b480 ed97c54c [ 53.850282] ee75b480 ee409e88 c123e292 00000009 00000000 ee409ef8 c16c03bf c1207fac [ 53.850339] Call Trace: [ 53.850376] [<c18816d2>] dump_stack+0x4b/0x79 [ 53.850408] [<c123e254>] warn_slowpath_common+0x84/0xa0 [ 53.850436] [<c16c03bf>] ? sdhci_send_command+0xb4f/0xc50 [ 53.850462] [<c16c03bf>] ? sdhci_send_command+0xb4f/0xc50 [ 53.850490] [<c123e292>] warn_slowpath_null+0x22/0x30 [ 53.850516] [<c16c03bf>] sdhci_send_command+0xb4f/0xc50 [ 53.850545] [<c1207fac>] ? native_sched_clock+0x2c/0xb0 [ 53.850575] [<c14c1f93>] ? delay_tsc+0x73/0xb0 [ 53.850601] [<c14c1ebe>] ? __const_udelay+0x1e/0x20 [ 53.850626] [<c16bdeb3>] ? sdhci_reset+0x93/0x190 [ 53.850654] [<c16c05b0>] sdhci_finish_data+0xf0/0x2e0 [ 53.850683] [<c16c130f>] sdhci_irq+0x31f/0x930 [ 53.850713] [<c12cb080>] ? __buffer_unlock_commit+0x10/0x20 [ 53.850740] [<c12cbcd7>] ? trace_buffer_unlock_commit+0x37/0x50 [ 53.850773] [<c1288f3c>] handle_irq_event_percpu+0x5c/0x220 [ 53.850800] [<c128bc96>] ? handle_fasteoi_irq+0x16/0xd0 [ 53.850827] [<c128913a>] handle_irq_event+0x3a/0x60 [ 53.850852] [<c128bc80>] ? unmask_irq+0x30/0x30 [ 53.850878] [<c128bcce>] handle_fasteoi_irq+0x4e/0xd0 [ 53.850895] <IRQ> [<c1890b52>] ? do_IRQ+0x42/0xb0 [ 53.850943] [<c1890a31>] ? common_interrupt+0x31/0x38 [ 53.850973] [<c12b00d8>] ? cgroup_mkdir+0x4e8/0x580 [ 53.851001] [<c1208d32>] ? default_idle+0x22/0xf0 [ 53.851029] [<c1209576>] ? arch_cpu_idle+0x26/0x30 [ 53.851054] [<c1288505>] ? cpu_startup_entry+0x65/0x240 [ 53.851082] [<c18793d5>] ? rest_init+0xb5/0xc0 [ 53.851108] [<c1879320>] ? __read_lock_failed+0x18/0x18 [ 53.851138] [<c1bf6a15>] ? start_kernel+0x31b/0x321 [ 53.851164] [<c1bf652f>] ? repair_env_string+0x51/0x51 [ 53.851190] [<c1bf6363>] ? i386_start_kernel+0x139/0x13c [ 53.851209] ---[ end trace 92777f5fe48d33f2 ]--- [ 53.853449] mmcblk0: error -84 transferring data, sector 11142162, nr 304, cmd response 0x0, card status 0x0 [ 53.853476] mmcblk0: retrying using single block read [ 55.937863] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 56.207951] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 66.228785] mmc0: Timeout waiting for hardware interrupt. [ 66.230855] ------------[ cut here ]------------ Signed-off-by: David Cohen <[email protected]> Reviewed-by: Chuanxiao Dong <[email protected]> Acked-by: Dong Aisheng <[email protected]> Cc: stable <[email protected]> # [3.13] Signed-off-by: Chris Ball <[email protected]>
hardkernel
referenced
this pull request
in hardkernel/linux
Jan 16, 2014
The irq data for rng module defined in hwmod data previously missed the OMAP_INTC_START relative offset, so the interrupt number is probably misconfigured during the DT node addition adjusting for this OMAP_INTC_START. Interrupt #36 is associated with a watchdog timer, so fix the rng module's interrupt to the appropriate interrupt #52. Signed-off-by: Suman Anna <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
heftig
referenced
this pull request
in zen-kernel/zen-kernel
Feb 6, 2014
commit 390145f upstream. Due to unknown hw issue so far, Merrifield is unable to enable HS200 support. This patch adds quirk to avoid SDHCI to initialize with error below: [ 53.850132] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.12.0-rc6-00037-g3d7c8d9-dirty #36 [ 53.850150] Hardware name: Intel Corporation Merrifield/SALT BAY, BIOS 397 2013.09.12:11.51.40 [ 53.850167] 00000000 00000000 ee409e48 c18816d2 00000000 ee409e78 c123e254 c1acc9b0 [ 53.850227] 00000000 00000000 c1b14148 000003de c16c03bf c16c03bf ee75b480 ed97c54c [ 53.850282] ee75b480 ee409e88 c123e292 00000009 00000000 ee409ef8 c16c03bf c1207fac [ 53.850339] Call Trace: [ 53.850376] [<c18816d2>] dump_stack+0x4b/0x79 [ 53.850408] [<c123e254>] warn_slowpath_common+0x84/0xa0 [ 53.850436] [<c16c03bf>] ? sdhci_send_command+0xb4f/0xc50 [ 53.850462] [<c16c03bf>] ? sdhci_send_command+0xb4f/0xc50 [ 53.850490] [<c123e292>] warn_slowpath_null+0x22/0x30 [ 53.850516] [<c16c03bf>] sdhci_send_command+0xb4f/0xc50 [ 53.850545] [<c1207fac>] ? native_sched_clock+0x2c/0xb0 [ 53.850575] [<c14c1f93>] ? delay_tsc+0x73/0xb0 [ 53.850601] [<c14c1ebe>] ? __const_udelay+0x1e/0x20 [ 53.850626] [<c16bdeb3>] ? sdhci_reset+0x93/0x190 [ 53.850654] [<c16c05b0>] sdhci_finish_data+0xf0/0x2e0 [ 53.850683] [<c16c130f>] sdhci_irq+0x31f/0x930 [ 53.850713] [<c12cb080>] ? __buffer_unlock_commit+0x10/0x20 [ 53.850740] [<c12cbcd7>] ? trace_buffer_unlock_commit+0x37/0x50 [ 53.850773] [<c1288f3c>] handle_irq_event_percpu+0x5c/0x220 [ 53.850800] [<c128bc96>] ? handle_fasteoi_irq+0x16/0xd0 [ 53.850827] [<c128913a>] handle_irq_event+0x3a/0x60 [ 53.850852] [<c128bc80>] ? unmask_irq+0x30/0x30 [ 53.850878] [<c128bcce>] handle_fasteoi_irq+0x4e/0xd0 [ 53.850895] <IRQ> [<c1890b52>] ? do_IRQ+0x42/0xb0 [ 53.850943] [<c1890a31>] ? common_interrupt+0x31/0x38 [ 53.850973] [<c12b00d8>] ? cgroup_mkdir+0x4e8/0x580 [ 53.851001] [<c1208d32>] ? default_idle+0x22/0xf0 [ 53.851029] [<c1209576>] ? arch_cpu_idle+0x26/0x30 [ 53.851054] [<c1288505>] ? cpu_startup_entry+0x65/0x240 [ 53.851082] [<c18793d5>] ? rest_init+0xb5/0xc0 [ 53.851108] [<c1879320>] ? __read_lock_failed+0x18/0x18 [ 53.851138] [<c1bf6a15>] ? start_kernel+0x31b/0x321 [ 53.851164] [<c1bf652f>] ? repair_env_string+0x51/0x51 [ 53.851190] [<c1bf6363>] ? i386_start_kernel+0x139/0x13c [ 53.851209] ---[ end trace 92777f5fe48d33f2 ]--- [ 53.853449] mmcblk0: error -84 transferring data, sector 11142162, nr 304, cmd response 0x0, card status 0x0 [ 53.853476] mmcblk0: retrying using single block read [ 55.937863] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 56.207951] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 66.228785] mmc0: Timeout waiting for hardware interrupt. [ 66.230855] ------------[ cut here ]------------ Signed-off-by: David Cohen <[email protected]> Reviewed-by: Chuanxiao Dong <[email protected]> Acked-by: Dong Aisheng <[email protected]> Signed-off-by: Chris Ball <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Mar 19, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Mar 20, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
lunn
pushed a commit
to lunn/linux
that referenced
this pull request
Mar 22, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
zeitgeist87
pushed a commit
to zeitgeist87/linux
that referenced
this pull request
Mar 26, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Mar 31, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ddstreet
pushed a commit
to ddstreet/linux
that referenced
this pull request
Apr 8, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
gregnietsky
pushed a commit
to Distrotech/linux
that referenced
this pull request
Apr 9, 2014
commit 4523e14 upstream hugetlb_reserve_pages() can be used for either normal file-backed hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous mode, there is not a VMA around. The new call to resv_map_put() assumed that there was, and resulted in a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: vma_resv_map+0x9/0x30 PGD 141453067 PUD 1421e1067 PMD 0 Oops: 0000 [#1] PREEMPT SMP ... Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ torvalds#36 RIP: vma_resv_map+0x9/0x30 ... Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0) Call Trace: resv_map_put+0xe/0x40 hugetlb_reserve_pages+0xa6/0x1d0 hugetlb_file_setup+0x102/0x2c0 newseg+0x115/0x360 ipcget+0x1ce/0x310 sys_shmget+0x5a/0x60 system_call_fastpath+0x16/0x1b This was reported by Dave Jones, but was reproducible with the libhugetlbfs test cases, so shame on me for not running them in the first place. With this, the oops is gone, and the output of libhugetlbfs's run_tests.py is identical to plain 3.4 again. [ Marked for stable, since this was introduced by commit c50ac05 ("hugetlb: fix resv_map leak in error path") which was also marked for stable ] Reported-by: Dave Jones <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Cc: <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau <[email protected]>
gregnietsky
pushed a commit
to Distrotech/linux
that referenced
this pull request
Apr 9, 2014
commit 4523e14 upstream. hugetlb_reserve_pages() can be used for either normal file-backed hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous mode, there is not a VMA around. The new call to resv_map_put() assumed that there was, and resulted in a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: vma_resv_map+0x9/0x30 PGD 141453067 PUD 1421e1067 PMD 0 Oops: 0000 [#1] PREEMPT SMP ... Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ torvalds#36 RIP: vma_resv_map+0x9/0x30 ... Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0) Call Trace: resv_map_put+0xe/0x40 hugetlb_reserve_pages+0xa6/0x1d0 hugetlb_file_setup+0x102/0x2c0 newseg+0x115/0x360 ipcget+0x1ce/0x310 sys_shmget+0x5a/0x60 system_call_fastpath+0x16/0x1b This was reported by Dave Jones, but was reproducible with the libhugetlbfs test cases, so shame on me for not running them in the first place. With this, the oops is gone, and the output of libhugetlbfs's run_tests.py is identical to plain 3.4 again. [ Marked for stable, since this was introduced by commit c50ac05 ("hugetlb: fix resv_map leak in error path") which was also marked for stable ] Reported-by: Dave Jones <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]>
tom3q
pushed a commit
to tom3q/linux
that referenced
this pull request
Apr 13, 2014
WARNING: storage class should be at the beginning of the declaration torvalds#25: FILE: lib/smp_processor_id.c:10: +notrace static unsigned int check_preemption_disabled(const char *what1, WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ... torvalds#36: FILE: lib/smp_processor_id.c:42: + printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n", ERROR: space required after that ',' (ctx:VxV) torvalds#46: FILE: lib/smp_processor_id.c:56: + return check_preemption_disabled("smp_processor_id",""); ^ total: 1 errors, 2 warnings, 36 lines checked ./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
jonsmirl
pushed a commit
to jonsmirl/lpc31xx
that referenced
this pull request
Jun 20, 2014
Reserve memory only when needed (mark 2)
torvalds
pushed a commit
that referenced
this pull request
Oct 29, 2014
This patch wires up the new syscall sys_bpf() on powerpc. Passes the tests in samples/bpf: #0 add+sub+mul OK #1 unreachable OK #2 unreachable2 OK #3 out of range jump OK #4 out of range jump2 OK #5 test1 ld_imm64 OK #6 test2 ld_imm64 OK #7 test3 ld_imm64 OK #8 test4 ld_imm64 OK #9 test5 ld_imm64 OK #10 no bpf_exit OK #11 loop (back-edge) OK #12 loop2 (back-edge) OK #13 conditional loop OK #14 read uninitialized register OK #15 read invalid register OK #16 program doesn't init R0 before exit OK #17 stack out of bounds OK #18 invalid call insn1 OK #19 invalid call insn2 OK #20 invalid function call OK #21 uninitialized stack1 OK #22 uninitialized stack2 OK #23 check valid spill/fill OK #24 check corrupted spill/fill OK #25 invalid src register in STX OK #26 invalid dst register in STX OK #27 invalid dst register in ST OK #28 invalid src register in LDX OK #29 invalid dst register in LDX OK #30 junk insn OK #31 junk insn2 OK #32 junk insn3 OK #33 junk insn4 OK #34 junk insn5 OK #35 misaligned read from stack OK #36 invalid map_fd for function call OK #37 don't check return value before access OK #38 access memory with incorrect alignment OK #39 sometimes access memory with incorrect alignment OK #40 jump test 1 OK #41 jump test 2 OK #42 jump test 3 OK #43 jump test 4 OK Signed-off-by: Pranith Kumar <[email protected]> [mpe: test using samples/bpf] Signed-off-by: Michael Ellerman <[email protected]>
dabrace
referenced
this pull request
in dabrace/linux
Nov 10, 2014
This patch wires up the new syscall sys_bpf() on powerpc. Passes the tests in samples/bpf: #0 add+sub+mul OK #1 unreachable OK #2 unreachable2 OK #3 out of range jump OK #4 out of range jump2 OK #5 test1 ld_imm64 OK #6 test2 ld_imm64 OK #7 test3 ld_imm64 OK #8 test4 ld_imm64 OK #9 test5 ld_imm64 OK #10 no bpf_exit OK #11 loop (back-edge) OK #12 loop2 (back-edge) OK #13 conditional loop OK #14 read uninitialized register OK #15 read invalid register OK #16 program doesn't init R0 before exit OK #17 stack out of bounds OK #18 invalid call insn1 OK #19 invalid call insn2 OK #20 invalid function call OK #21 uninitialized stack1 OK #22 uninitialized stack2 OK #23 check valid spill/fill OK #24 check corrupted spill/fill OK #25 invalid src register in STX OK #26 invalid dst register in STX OK #27 invalid dst register in ST OK #28 invalid src register in LDX OK #29 invalid dst register in LDX OK #30 junk insn OK #31 junk insn2 OK #32 junk insn3 OK #33 junk insn4 OK #34 junk insn5 OK #35 misaligned read from stack OK #36 invalid map_fd for function call OK #37 don't check return value before access OK #38 access memory with incorrect alignment OK #39 sometimes access memory with incorrect alignment OK #40 jump test 1 OK #41 jump test 2 OK #42 jump test 3 OK #43 jump test 4 OK Signed-off-by: Pranith Kumar <[email protected]> [mpe: test using samples/bpf] Signed-off-by: Michael Ellerman <[email protected]>
prahal
pushed a commit
to prahal/linux
that referenced
this pull request
May 5, 2015
…hotplug notif. This is to fix an issue of sleeping in atomic context when processing hotplug notifications in Exynos MCT(Multi-Core Timer). The issue was reproducible on Exynos 3250 (Rinato board) and Exynos 5420 (Arndale Octa board). Whilst testing cpu hotplug events on kernel configured with DEBUG_PREEMPT and DEBUG_ATOMIC_SLEEP we get following BUG message, caused by calling request_irq() and free_irq() in the context of hotplug notification (which is in this case atomic context). root@target:~# echo 0 > /sys/devices/system/cpu/cpu1/online [ 25.157867] IRQ18 no longer affine to CPU1 ... [ 25.158445] CPU1: shutdown root@target:~# echo 1 > /sys/devices/system/cpu/cpu1/online [ 40.785859] CPU1: Software reset [ 40.786660] BUG: sleeping function called from invalid context at mm/slub.c:1241 [ 40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1 [ 40.786678] Preemption disabled at:[< (null)>] (null) [ 40.786681] [ 40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-00024-g7dca860 torvalds#36 [ 40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 40.786728] [<c0014a00>] (unwind_backtrace) from [<c0011980>] (show_stack+0x10/0x14) [ 40.786747] [<c0011980>] (show_stack) from [<c0449ba0>] (dump_stack+0x70/0xbc) [ 40.786767] [<c0449ba0>] (dump_stack) from [<c00c6124>] (kmem_cache_alloc+0xd8/0x170) [ 40.786785] [<c00c6124>] (kmem_cache_alloc) from [<c005d6f8>] (request_threaded_irq+0x64/0x128) [ 40.786804] [<c005d6f8>] (request_threaded_irq) from [<c0350b8c>] (exynos4_local_timer_setup+0xc0/0x13c) [ 40.786820] [<c0350b8c>] (exynos4_local_timer_setup) from [<c0350ca8>] (exynos4_mct_cpu_notify+0x30/0xa8) [ 40.786838] [<c0350ca8>] (exynos4_mct_cpu_notify) from [<c003b330>] (notifier_call_chain+0x44/0x84) [ 40.786857] [<c003b330>] (notifier_call_chain) from [<c0022fd4>] (__cpu_notify+0x28/0x44) [ 40.786873] [<c0022fd4>] (__cpu_notify) from [<c0013714>] (secondary_start_kernel+0xec/0x150) [ 40.786886] [<c0013714>] (secondary_start_kernel) from [<40008764>] (0x40008764) Solution: Clockevent irqs cannot be requested/freed every time cpu is hot-plugged/unplugged as CPU_STARTING/CPU_DYING notifications that signals hotplug or unplug events are sent with both preemption and local interrupts disabled. Since request_irq() may sleep it is moved to the initialization stage and performed for all possible cpus prior putting them online. Then, in the case of hotplug event the irq asociated with the given cpu will simply be enabled. Similarly on cpu unplug event the interrupt is not freed but just disabled. Note that after successful request_irq() call for a clockevent device associated to given cpu the requested irq is disabled (via disable_irq()). That is to make things symmetric as we expect hotplug event as a next thing (which will enable irq again). This should not pose any problems because at the time of requesting irq the clockevent device is not fully initialized yet, therefore should not produce interrupts at that point. For disabling an irq at cpu unplug notification disable_irq_nosync() is chosen which is a non-blocking function. This again shouldn't be a problem as prior sending CPU_DYING notification interrupts are migrated away from the cpu. Fixes: 7114cd7 ("clocksource: exynos_mct: use (request/free)_irq calls for local timer registration") Signed-off-by: Damian Eppel <[email protected]> Cc: <[email protected]> Reported-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Tested-by: Krzysztof Kozlowski <[email protected]> (Tested on Arndale Octa Board) Tested-by: Marcin Jabrzyk <[email protected]> (Tested on Rinato B2 (Exynos 3250) board)
aejsmith
pushed a commit
to aejsmith/linux
that referenced
this pull request
Jun 26, 2015
Switch to new UART driver from upstream This switches to the new UART driver that is now heading upstream (in Ralf's upstream-sfr tree). The reason this is done is because there still remain a few things that output to UART0 rather than UART4 (the dedicated UART header), like the early console. The current code has UART0 harcoded for this. The upstream driver allows the early console UART to be configured by DT, so take advantage of this by switching to that driver and setting stdout-path to UART4 in DT. All serial output from the kernel should now go to UART4 by default.
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 2, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 4, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
Sep 4, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 4, 2025
commit d3af6ca upstream. After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
Sep 5, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 9, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 10, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
MatthewCroughan
pushed a commit
to MatthewCroughan/linux
that referenced
this pull request
Sep 10, 2025
After hid_hw_start() is called hidinput_connect() will eventually be called to set up the device with the input layer since the HID_CONNECT_DEFAULT connect mask is used. During hidinput_connect() all input and output reports are processed and corresponding hid_inputs are allocated and configured via hidinput_configure_usages(). This process involves slot tagging report fields and configuring usages by setting relevant bits in the capability bitmaps. However it is possible that the capability bitmaps are not set at all leading to the subsequent hidinput_has_been_populated() check to fail leading to the freeing of the hid_input and the underlying input device. This becomes problematic because a malicious HID device like a ASUS ROG N-Key keyboard can trigger the above scenario via a specially crafted descriptor which then leads to a user-after-free when the name of the freed input device is written to later on after hid_hw_start(). Below, report 93 intentionally utilises the HID_UP_UNDEFINED Usage Page which is skipped during usage configuration, leading to the frees. 0x05, 0x0D, // Usage Page (Digitizer) 0x09, 0x05, // Usage (Touch Pad) 0xA1, 0x01, // Collection (Application) 0x85, 0x0D, // Report ID (13) 0x06, 0x00, 0xFF, // Usage Page (Vendor Defined 0xFF00) 0x09, 0xC5, // Usage (0xC5) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x04, // Report Count (4) 0xB1, 0x02, // Feature (Data,Var,Abs) 0x85, 0x5D, // Report ID (93) 0x06, 0x00, 0x00, // Usage Page (Undefined) 0x09, 0x01, // Usage (0x01) 0x15, 0x00, // Logical Minimum (0) 0x26, 0xFF, 0x00, // Logical Maximum (255) 0x75, 0x08, // Report Size (8) 0x95, 0x1B, // Report Count (27) 0x81, 0x02, // Input (Data,Var,Abs) 0xC0, // End Collection Below is the KASAN splat after triggering the UAF: [ 21.672709] ================================================================== [ 21.673700] BUG: KASAN: slab-use-after-free in asus_probe+0xeeb/0xf80 [ 21.673700] Write of size 8 at addr ffff88810a0ac000 by task kworker/1:2/54 [ 21.673700] [ 21.673700] CPU: 1 UID: 0 PID: 54 Comm: kworker/1:2 Not tainted 6.16.0-rc4-g9773391cf4dd-dirty torvalds#36 PREEMPT(voluntary) [ 21.673700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 21.673700] Call Trace: [ 21.673700] <TASK> [ 21.673700] dump_stack_lvl+0x5f/0x80 [ 21.673700] print_report+0xd1/0x660 [ 21.673700] kasan_report+0xe5/0x120 [ 21.673700] __asan_report_store8_noabort+0x1b/0x30 [ 21.673700] asus_probe+0xeeb/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Allocated by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_alloc_info+0x3b/0x50 [ 21.673700] __kasan_kmalloc+0x9c/0xa0 [ 21.673700] __kmalloc_cache_noprof+0x139/0x340 [ 21.673700] input_allocate_device+0x44/0x370 [ 21.673700] hidinput_connect+0xcb6/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] [ 21.673700] [ 21.673700] Freed by task 54: [ 21.673700] kasan_save_stack+0x3d/0x60 [ 21.673700] kasan_save_track+0x18/0x40 [ 21.673700] kasan_save_free_info+0x3f/0x60 [ 21.673700] __kasan_slab_free+0x3c/0x50 [ 21.673700] kfree+0xcf/0x350 [ 21.673700] input_dev_release+0xab/0xd0 [ 21.673700] device_release+0x9f/0x220 [ 21.673700] kobject_put+0x12b/0x220 [ 21.673700] put_device+0x12/0x20 [ 21.673700] input_free_device+0x4c/0xb0 [ 21.673700] hidinput_connect+0x1862/0x2630 [ 21.673700] hid_connect+0xf74/0x1d60 [ 21.673700] hid_hw_start+0x8c/0x110 [ 21.673700] asus_probe+0x5a3/0xf80 [ 21.673700] hid_device_probe+0x2ee/0x700 [ 21.673700] really_probe+0x1c6/0x6b0 [ 21.673700] __driver_probe_device+0x24f/0x310 [ 21.673700] driver_probe_device+0x4e/0x220 [...] Fixes: 9ce12d8 ("HID: asus: Add i2c touchpad support") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
Sep 11, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
Sep 12, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 13, 2025
[ Upstream commit 0bb2f7a ] When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1] Reproduction Steps: 1) Mount CIFS 2) Add an iptables rule to drop incoming FIN packets for CIFS 3) Unmount CIFS 4) Unload the CIFS module 5) Remove the iptables rule At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped. At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds. # ss -tan State Recv-Q Send-Q Local Address:Port Peer Address:Port FIN-WAIT-1 0 477 10.0.2.15:51062 10.0.0.137:445 # lsmod | grep cifs cifs 1159168 0 This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully. While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name(). Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired. Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref. If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue. Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free(). Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO. [0]: CIFS_SERVER="10.0.0.137" CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST" DEV="enp0s3" CRED="/root/WindowsCredential.txt" MNT=$(mktemp -d /tmp/XXXXXX) mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1 iptables -A INPUT -s ${CIFS_SERVER} -j DROP for i in $(seq 10); do umount ${MNT} rmmod cifs sleep 1 done rm -r ${MNT} iptables -D INPUT -s ${CIFS_SERVER} -j DROP [1]: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 torvalds#36 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) ... Call Trace: <IRQ> __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178) lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ... BUG: kernel NULL pointer dereference, address: 00000000000000c4 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 torvalds#36 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178) Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44 RSP: 0018:ffa0000000468a10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027 RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88 RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88 R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1 FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234) ip_sublist_rcv_finish (net/ipv4/ip_input.c:576) ip_list_rcv_finish (net/ipv4/ip_input.c:628) ip_list_rcv (net/ipv4/ip_input.c:670) __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986) netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129) napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815) __napi_poll.constprop.0 (net/core/dev.c:7191) net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382) handle_softirqs (kernel/softirq.c:561) __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662) irq_exit_rcu (kernel/softirq.c:680) common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744) Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202 RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5 RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:315) common_startup_64 (arch/x86/kernel/head_64.S:421) </TASK> Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CR2: 00000000000000c4 Fixes: ed07536 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [ Adjust context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
bjackman
pushed a commit
to bjackman/linux
that referenced
this pull request
Sep 14, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 14, 2025
[ Upstream commit 0bb2f7a ] When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1] Reproduction Steps: 1) Mount CIFS 2) Add an iptables rule to drop incoming FIN packets for CIFS 3) Unmount CIFS 4) Unload the CIFS module 5) Remove the iptables rule At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped. At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds. # ss -tan State Recv-Q Send-Q Local Address:Port Peer Address:Port FIN-WAIT-1 0 477 10.0.2.15:51062 10.0.0.137:445 # lsmod | grep cifs cifs 1159168 0 This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully. While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name(). Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired. Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref. If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue. Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free(). Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO. [0]: CIFS_SERVER="10.0.0.137" CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST" DEV="enp0s3" CRED="/root/WindowsCredential.txt" MNT=$(mktemp -d /tmp/XXXXXX) mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1 iptables -A INPUT -s ${CIFS_SERVER} -j DROP for i in $(seq 10); do umount ${MNT} rmmod cifs sleep 1 done rm -r ${MNT} iptables -D INPUT -s ${CIFS_SERVER} -j DROP [1]: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 torvalds#36 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) ... Call Trace: <IRQ> __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178) lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ... BUG: kernel NULL pointer dereference, address: 00000000000000c4 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 torvalds#36 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178) Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44 RSP: 0018:ffa0000000468a10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027 RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88 RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88 R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1 FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234) ip_sublist_rcv_finish (net/ipv4/ip_input.c:576) ip_list_rcv_finish (net/ipv4/ip_input.c:628) ip_list_rcv (net/ipv4/ip_input.c:670) __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986) netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129) napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815) __napi_poll.constprop.0 (net/core/dev.c:7191) net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382) handle_softirqs (kernel/softirq.c:561) __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662) irq_exit_rcu (kernel/softirq.c:680) common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744) Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202 RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5 RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:315) common_startup_64 (arch/x86/kernel/head_64.S:421) </TASK> Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CR2: 00000000000000c4 Fixes: ed07536 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [ Adjust context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
bjackman
pushed a commit
to bjackman/linux
that referenced
this pull request
Sep 15, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 16, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 17, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 17, 2025
[ Upstream commit 0bb2f7a ] When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1] Reproduction Steps: 1) Mount CIFS 2) Add an iptables rule to drop incoming FIN packets for CIFS 3) Unmount CIFS 4) Unload the CIFS module 5) Remove the iptables rule At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped. At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds. # ss -tan State Recv-Q Send-Q Local Address:Port Peer Address:Port FIN-WAIT-1 0 477 10.0.2.15:51062 10.0.0.137:445 # lsmod | grep cifs cifs 1159168 0 This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully. While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name(). Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired. Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref. If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue. Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free(). Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO. [0]: CIFS_SERVER="10.0.0.137" CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST" DEV="enp0s3" CRED="/root/WindowsCredential.txt" MNT=$(mktemp -d /tmp/XXXXXX) mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1 iptables -A INPUT -s ${CIFS_SERVER} -j DROP for i in $(seq 10); do umount ${MNT} rmmod cifs sleep 1 done rm -r ${MNT} iptables -D INPUT -s ${CIFS_SERVER} -j DROP [1]: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 torvalds#36 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) ... Call Trace: <IRQ> __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178) lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ... BUG: kernel NULL pointer dereference, address: 00000000000000c4 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 torvalds#36 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178) Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44 RSP: 0018:ffa0000000468a10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027 RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88 RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88 R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1 FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234) ip_sublist_rcv_finish (net/ipv4/ip_input.c:576) ip_list_rcv_finish (net/ipv4/ip_input.c:628) ip_list_rcv (net/ipv4/ip_input.c:670) __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986) netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129) napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815) __napi_poll.constprop.0 (net/core/dev.c:7191) net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382) handle_softirqs (kernel/softirq.c:561) __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662) irq_exit_rcu (kernel/softirq.c:680) common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744) Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202 RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5 RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:315) common_startup_64 (arch/x86/kernel/head_64.S:421) </TASK> Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CR2: 00000000000000c4 Fixes: ed07536 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [ Adjust context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 18, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
bjackman
pushed a commit
to bjackman/linux
that referenced
this pull request
Sep 19, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 19, 2025
[ Upstream commit 0bb2f7a ] When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1] Reproduction Steps: 1) Mount CIFS 2) Add an iptables rule to drop incoming FIN packets for CIFS 3) Unmount CIFS 4) Unload the CIFS module 5) Remove the iptables rule At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped. At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds. # ss -tan State Recv-Q Send-Q Local Address:Port Peer Address:Port FIN-WAIT-1 0 477 10.0.2.15:51062 10.0.0.137:445 # lsmod | grep cifs cifs 1159168 0 This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully. While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name(). Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired. Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref. If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue. Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free(). Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO. [0]: CIFS_SERVER="10.0.0.137" CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST" DEV="enp0s3" CRED="/root/WindowsCredential.txt" MNT=$(mktemp -d /tmp/XXXXXX) mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1 iptables -A INPUT -s ${CIFS_SERVER} -j DROP for i in $(seq 10); do umount ${MNT} rmmod cifs sleep 1 done rm -r ${MNT} iptables -D INPUT -s ${CIFS_SERVER} -j DROP [1]: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 torvalds#36 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) ... Call Trace: <IRQ> __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178) lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ... BUG: kernel NULL pointer dereference, address: 00000000000000c4 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 torvalds#36 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178) Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44 RSP: 0018:ffa0000000468a10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027 RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88 RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88 R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1 FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234) ip_sublist_rcv_finish (net/ipv4/ip_input.c:576) ip_list_rcv_finish (net/ipv4/ip_input.c:628) ip_list_rcv (net/ipv4/ip_input.c:670) __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986) netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129) napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815) __napi_poll.constprop.0 (net/core/dev.c:7191) net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382) handle_softirqs (kernel/softirq.c:561) __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662) irq_exit_rcu (kernel/softirq.c:680) common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744) Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202 RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5 RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:315) common_startup_64 (arch/x86/kernel/head_64.S:421) </TASK> Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CR2: 00000000000000c4 Fixes: ed07536 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [ Adjust context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 19, 2025
[ Upstream commit 0bb2f7a ] When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1] Reproduction Steps: 1) Mount CIFS 2) Add an iptables rule to drop incoming FIN packets for CIFS 3) Unmount CIFS 4) Unload the CIFS module 5) Remove the iptables rule At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped. At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds. # ss -tan State Recv-Q Send-Q Local Address:Port Peer Address:Port FIN-WAIT-1 0 477 10.0.2.15:51062 10.0.0.137:445 # lsmod | grep cifs cifs 1159168 0 This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully. While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name(). Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired. Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref. If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue. Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free(). Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO. [0]: CIFS_SERVER="10.0.0.137" CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST" DEV="enp0s3" CRED="/root/WindowsCredential.txt" MNT=$(mktemp -d /tmp/XXXXXX) mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1 iptables -A INPUT -s ${CIFS_SERVER} -j DROP for i in $(seq 10); do umount ${MNT} rmmod cifs sleep 1 done rm -r ${MNT} iptables -D INPUT -s ${CIFS_SERVER} -j DROP [1]: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 torvalds#36 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223) ... Call Trace: <IRQ> __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178) lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ... BUG: kernel NULL pointer dereference, address: 00000000000000c4 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 torvalds#36 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178) Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44 RSP: 0018:ffa0000000468a10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027 RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88 RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88 R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1 FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816) _raw_spin_lock_nested (kernel/locking/spinlock.c:379) tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234) ip_sublist_rcv_finish (net/ipv4/ip_input.c:576) ip_list_rcv_finish (net/ipv4/ip_input.c:628) ip_list_rcv (net/ipv4/ip_input.c:670) __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986) netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129) napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815) __napi_poll.constprop.0 (net/core/dev.c:7191) net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382) handle_softirqs (kernel/softirq.c:561) __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662) irq_exit_rcu (kernel/softirq.c:680) common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744) Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202 RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5 RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325) cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:315) common_startup_64 (arch/x86/kernel/head_64.S:421) </TASK> Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CR2: 00000000000000c4 Fixes: ed07536 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [ Adjust context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
bjackman
pushed a commit
to bjackman/linux
that referenced
this pull request
Sep 20, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
Sep 21, 2025
Patch series "mm: remove nth_page()", v2. As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch torvalds#6 -> torvalds#13 : disallow folios to have non-contiguous pages Patch torvalds#14 -> torvalds#20 : remove nth_page() usage within folios Patch torvalds#22 : disallow CMA allocations of non-contiguous pages Patch torvalds#23 -> torvalds#33 : sanity+check + remove nth_page() usage within SG entry Patch torvalds#34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch torvalds#35 : remove nth_page() in kfence Patch torvalds#36 : adjust stale comment regarding nth_page Patch torvalds#37 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. This patch (of 37): In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Zi Yan <[email protected]> Acked-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: SeongJae Park <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Alex Willamson <[email protected]> Cc: Bart van Assche <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Brendan Jackman <[email protected]> Cc: Brett Creeley <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christoph Lameter (Ampere) <[email protected]> Cc: Damien Le Maol <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Doug Gilbert <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Inki Dae <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Lahtinen <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Lars Persson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marco Elver <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: Maxim Levitky <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Niklas Cassel <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Pavel Begunkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Robin Murohy <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleinxer <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yishai Hadas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.