@RevySR RevySR commented Jul 31, 2025

No description provided.

KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as the assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
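
To illustrate the mismatch described above, here is a minimal user-space sketch. It assumes the warning comes from an alignment check made against the kernel page size; the helpers `is_aligned', `range_is_aligned' and `align_down' are hypothetical, and the `SZ_4K'/`SZ_16K' definitions are simplified stand-ins rather than the kernel's own macros. It is not the xe driver's code, only a model of why a 4KiB-granular range can fail a 16KiB check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the kernel's own definitions. */
#define SZ_4K   UINT64_C(0x1000)
#define SZ_16K  UINT64_C(0x4000)   /* PAGE_SIZE on a 16KiB-paged kernel */

/* True if x is a multiple of the power-of-two alignment a. */
static bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/* Hypothetical check: both start and size must be aligned to `align`. */
static bool range_is_aligned(uint64_t start, uint64_t size, uint64_t align)
{
	return is_aligned(start, align) && is_aligned(size, align);
}

/* Hypothetical helper: round an address down to an `align` boundary. */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/* Walks through one 4KiB-granular range under both assumptions. */
static void demo(void)
{
	uint64_t start = 0x5000, size = SZ_4K;

	/* Valid for a 4KiB-granular GPU page table... */
	assert(range_is_aligned(start, size, SZ_4K));
	/* ...but rejected when the check uses a 16KiB PAGE_SIZE. */
	assert(!range_is_aligned(start, size, SZ_16K));

	/* Masking with the 16KiB page mask lands on the wrong 4KiB page. */
	assert(align_down(start, SZ_16K) == 0x4000);
	assert(align_down(start, SZ_4K) == 0x5000);
}
```

Calling `demo()' from a `main()' exercises both cases. On a 4KiB-paged kernel the two alignments coincide, which would explain why the assumption goes unnoticed there.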

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
KexyBiscuit pushed a commit that referenced this pull request Aug 1, 2025
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 2 times, most recently from 76821ec to e7e5f1f Compare August 3, 2025 21:24
KexyBiscuit pushed a commit that referenced this pull request Aug 3, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as the assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
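
To illustrate why `PAGE_SIZE' is not a safe stand-in for the 4 KiB GPU page granularity, here is a minimal user-space sketch. The macro and function names are hypothetical and are not taken from the xe driver; the sketch only shows that a 4 KiB-aligned address need not be aligned to a 16 KiB kernel page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the xe driver's definitions. */
#define GPU_PAGE_SIZE  4096u   /* GPU PTE granularity (4 KiB) */
#define CPU_PAGE_16K   16384u  /* kernel page size on a 16 KiB-paged build */

/* True when addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/* Number of 4 KiB GPU PTEs needed to cover size bytes. */
static uint64_t gpu_pte_count(uint64_t size)
{
	return (size + GPU_PAGE_SIZE - 1) / GPU_PAGE_SIZE;
}
```

With these definitions, an address such as 0x1000 passes a 4 KiB alignment check but fails a 16 KiB one, and one 16 KiB kernel page spans four GPU PTEs. A check written against the kernel page size therefore rejects addresses that are valid for the GPU.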

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 2 times, most recently from e7333bd to 772bca6 Compare August 6, 2025 11:22
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
syzbot [1] reported the following:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced with the reproducer [2]. Once
CONFIG_F2FS_CHECK_FS is enabled, the reproducer triggers the panic below,
so the direct cause of this bug is the same as the one fixed by patch [3].

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/[email protected]

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is that, in the fuzzed image, dnode #8 belongs to inode #7,
so dnode #8 was dropped when inode #7 was evicted.

However, there is a dirent that has ino #8, so once we unlink file3, both
f2fs_truncate() and f2fs_update_inode_page() fail in f2fs_evict_inode()
because node #8 cannot be loaded. As a result, f2fs_inode_synced() is never
called to clear the inode's dirty status.

Let's fix this by calling f2fs_inode_synced() in the error path of
f2fs_evict_inode().
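
The idea behind the fix can be sketched with a small user-space model. Everything prefixed with model_ is hypothetical and greatly simplified; it is not the f2fs code, and only illustrates why the error path must still remove the inode from the dirty list before the inode is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the f2fs structures. */
struct model_inode {
	bool dirty;                /* true while linked on the dirty list */
	struct model_inode *next;
};

struct model_sb {
	struct model_inode *dirty_list;
};

static void model_mark_dirty(struct model_sb *sb, struct model_inode *inode)
{
	if (inode->dirty)
		return;
	inode->dirty = true;
	inode->next = sb->dirty_list;
	sb->dirty_list = inode;
}

/* Plays the role of f2fs_inode_synced(): unlink and clear dirty state. */
static void model_inode_synced(struct model_sb *sb, struct model_inode *inode)
{
	struct model_inode **pp = &sb->dirty_list;

	while (*pp && *pp != inode)
		pp = &(*pp)->next;
	if (*pp)
		*pp = inode->next;
	inode->next = NULL;
	inode->dirty = false;
}

/* Metadata update: clears dirty state only when the node can be loaded. */
static int model_update_inode_page(struct model_sb *sb,
				   struct model_inode *inode,
				   bool node_loadable)
{
	if (!node_loadable)
		return -2; /* same value as "ret = -2" in the truncate tracepoint */
	model_inode_synced(sb, inode);
	return 0;
}

static void model_evict_inode(struct model_sb *sb, struct model_inode *inode,
			      bool node_loadable)
{
	int err = model_update_inode_page(sb, inode, node_loadable);

	/* The fix: on failure, still drop the inode from the dirty list. */
	if (err && inode->dirty)
		model_inode_synced(sb, inode);

	/* After eviction the inode may be freed, so nothing may point at it. */
	assert(!inode->dirty);
}
```

Without the error-path call, the evicted inode would stay linked on the dirty list, and a later walk of that list (as the checkpoint does in the KASAN report) would touch freed memory.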

PS: I verified that the reproducer [2] triggers this bug in v6.1.129 but
fails to do so in v6.16-rc4, because the test case stops once f2fs detects
other corruption:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
Patch series "extend hung task blocker tracking to rwsems".

Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)
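
The flow described above can be sketched as a small user-space model. All model_ names are hypothetical and do not reproduce the kernel's hung task or rwsem interfaces; the sketch only shows a waiter recording what it sleeps on and a detector later reading the owner field to name a likely holder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_task;

/* Hypothetical model of an rwsem: one owner pointer plus a reader flag. */
struct model_rwsem {
	const struct model_task *owner; /* the writer, or one of the readers */
	bool reader_owned;
};

struct model_task {
	int pid;
	const struct model_rwsem *blocker; /* lock this task is sleeping on */
};

/* Waiter side: record the lock before sleeping uninterruptibly. */
static void model_set_blocker(struct model_task *t,
			      const struct model_rwsem *sem)
{
	t->blocker = sem;
}

/* Waiter side: forget the lock once it has been acquired. */
static void model_clear_blocker(struct model_task *t)
{
	t->blocker = NULL;
}

/*
 * Detector side: return the pid of the likely owner, or -1 if unknown.
 * *is_writer is set only when an owner is found.
 */
static int model_blocker_owner_pid(const struct model_task *waiter,
				   bool *is_writer)
{
	const struct model_rwsem *sem = waiter->blocker;

	assert(is_writer != NULL);
	if (!sem || !sem->owner)
		return -1;
	*is_writer = !sem->reader_owned;
	return sem->owner->pid;
}
```

When the lock is writer-owned, the owner field names the single holder, so the answer is reliable. When it is reader-owned, the field names at most one of several readers, which is why the result is only "likely".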

With this change, the hung task detector can now show the blocker task's
info, as below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>


This patch (of 3):

In preparation for extending blocker tracking to support rwsems, make the
rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
determining if the blocker is a writer or one of the readers.

Additionally, a stale owner pointer in a reader-owned rwsem can lead to
false positives in blocker tracking when CONFIG_DETECT_HUNG_TASK_BLOCKER
is enabled.  To mitigate this, clear the owner field on the reader unlock
path, similar to what CONFIG_DEBUG_RWSEMS does.  A NULL owner is better
than a stale one for diagnostics.
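
A rough sketch of the "clear the owner on reader unlock" idea follows. The structure and function names are hypothetical and the model ignores atomics and flag bits entirely; it only shows that a reader clears the owner field when it is the recorded owner, so the field never keeps pointing at a task that has already released the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; the real rwsem packs the owner and flags atomically. */
struct model_rwsem {
	const void *owner;   /* last reader to acquire, or NULL */
	unsigned int readers;
};

static void model_down_read(struct model_rwsem *sem, const void *task)
{
	sem->readers++;
	sem->owner = task;   /* remember one of the current readers */
}

static void model_up_read(struct model_rwsem *sem, const void *task)
{
	assert(sem->readers > 0);
	sem->readers--;
	/* Clear only our own entry; another reader's entry is still valid. */
	if (sem->owner == task)
		sem->owner = NULL;
}
```

If the recorded reader unlocks while others still hold the lock, the owner becomes NULL even though the lock is held. That is the trade-off the text describes: the detector loses the blocker but never reports a stale one.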

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/ [1]
Signed-off-by: Lance Yang <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Cc: Anna Schumaker <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Joel Granados <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mingzhe Yang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Tomasz Figa <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yongliang Gao <[email protected]>
Cc: Zi Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)

With this change, the hung task detector can now show the blocker task's
info, as below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>

[1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Lance Yang <[email protected]>
Suggested-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Cc: Anna Schumaker <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Joel Granados <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mingzhe Yang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Tomasz Figa <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yongliang Gao <[email protected]>
Cc: Zi Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 6 times, most recently from e6c3646 to 2934eba Compare August 7, 2025 12:33
@RevySR RevySR force-pushed the aosc/sg2042/v6.16.y branch from fd46b48 to b248cba Compare August 11, 2025 18:00
MingcongBai pushed a commit that referenced this pull request Aug 13, 2025
[ Upstream commit 16d8fd7 ]

In rtl8187_stop(), move the call to usb_kill_anchored_urbs() before clearing
b_tx_status.queue. This prevents callbacks from using an already freed skb,
which could happen because the anchor was not killed before the skb was freed.
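
The ordering requirement can be illustrated with a small user-space model. The model_ names are hypothetical and this is not the rtl8187 or USB core code; it only shows that once the pending URBs are killed first, no completion callback can run against an skb that the queue purge has freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an skb and of a URB that references it. */
struct model_skb {
	bool freed;
};

struct model_urb {
	struct model_skb *skb;
	bool killed; /* a killed URB never runs its completion callback */
};

static void model_kill_anchored_urbs(struct model_urb *urbs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		urbs[i].killed = true;
}

static void model_purge_queue(struct model_skb *skbs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		skbs[i].freed = true;
}

/* Completion callback: dereferences the skb unless the URB was killed. */
static void model_tx_cb(const struct model_urb *urb)
{
	if (urb->killed)
		return;
	assert(!urb->skb->freed); /* would be a use-after-free otherwise */
}

/* Stop path in the fixed order: kill URBs first, then free the skbs. */
static void model_stop(struct model_urb *urbs, size_t n_urbs,
		       struct model_skb *skbs, size_t n_skbs)
{
	model_kill_anchored_urbs(urbs, n_urbs);
	model_purge_queue(skbs, n_skbs);
}
```

Swapping the two calls in model_stop() would leave a window in which model_tx_cb() runs on a live URB whose skb is already freed, which is the situation the patch removes.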

 BUG: kernel NULL pointer dereference, address: 0000000000000080
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.15.0 #8 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 RIP: 0010:ieee80211_tx_status_irqsafe+0x21/0xc0 [mac80211]
 Call Trace:
  <IRQ>
  rtl8187_tx_cb+0x116/0x150 [rtl8187]
  __usb_hcd_giveback_urb+0x9d/0x120
  usb_giveback_urb_bh+0xbb/0x140
  process_one_work+0x19b/0x3c0
  bh_worker+0x1a7/0x210
  tasklet_action+0x10/0x30
  handle_softirqs+0xf0/0x340
  __irq_exit_rcu+0xcd/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>

Tested on RTL8187BvE device.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c1db52b ("rtl8187: Use usb anchor facilities to manage urbs")
Signed-off-by: Daniil Dulov <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 13, 2025
[ Upstream commit a509a55 ]

syzbot [1] reported the following:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced with the reproducer [2]. Once
CONFIG_F2FS_CHECK_FS is enabled, the reproducer triggers the panic below,
so the direct cause of this bug is the same as the one fixed by
patch [3].

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/[email protected]

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is that, in the fuzzed image, dnode #8 belongs to inode #7,
so dnode #8 was dropped when inode #7 was evicted.

However, there is a dirent that has ino #8. Once file3 is unlinked, both
f2fs_truncate() and f2fs_update_inode_page() fail in f2fs_evict_inode()
because node #8 cannot be loaded, so f2fs_inode_synced() is never called
to clear the inode's dirty status.

Fix this by calling f2fs_inode_synced() in the error path of
f2fs_evict_inode().
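
The failure mode described above can be modelled in a few lines of
user-space C. This is only a toy sketch and not the f2fs code: the
`toy_inode` structure and every `toy_*` function are hypothetical names
made up for illustration. It shows that when the update step fails, the
dirty state is cleared only if the error path does so explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an inode tracked on a dirty list; not the f2fs structure. */
struct toy_inode {
	bool on_dirty_list;
};

/* Toy counterpart of clearing dirty state: drop the inode from the dirty list. */
static void toy_inode_synced(struct toy_inode *inode)
{
	inode->on_dirty_list = false;
}

/* Toy update step; on success it clears the dirty state as a side effect. */
static int toy_update_inode(struct toy_inode *inode, bool node_loadable)
{
	if (!node_loadable)
		return -2; /* same error value as ret = -2 in the tracepoints */
	toy_inode_synced(inode);
	return 0;
}

/*
 * Toy eviction. Returns true if the inode would be freed while still on the
 * dirty list, i.e. a later walk of that list would touch freed memory.
 */
static bool toy_evict_leaves_dangling_entry(bool node_loadable, bool fix_applied)
{
	struct toy_inode inode = { .on_dirty_list = true };
	int err = toy_update_inode(&inode, node_loadable);

	/* The fix, in miniature: clear the dirty state on the error path too. */
	if (err && fix_applied)
		toy_inode_synced(&inode);

	return inode.on_dirty_list;
}
```

In this model, a failed update without the extra call leaves the inode on
the dirty list at the point it would be freed, which corresponds to the
stale entry that the checkpoint path later walks in the KASAN report.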

PS: I verified that the reproducer [2] triggers this bug on v6.1.129 but
not on v6.16-rc4, because there the testcase stops once f2fs detects
other corruption:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 13, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as the assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially mistaken use of
`PAGE_SIZE' in the relevant code.
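
The page-size mismatch can be illustrated outside the kernel. The sketch
below is a hypothetical user-space model, not the xe driver code:
`is_aligned()`, `GPU_PAGE_SIZE', and `KERNEL_PAGE_SIZE_16K' are names
invented for illustration. It shows that an address that is valid at a
4KiB granularity fails an alignment check made against a 16KiB kernel
page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not taken from the xe driver. */
#define GPU_PAGE_SIZE        UINT64_C(0x1000)  /* 4KiB granularity */
#define KERNEL_PAGE_SIZE_16K UINT64_C(0x4000)  /* PAGE_SIZE on a 16KiB-paged kernel */

/* Returns true when addr is a multiple of size; size must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t size)
{
	return (addr & (size - 1)) == 0;
}

/* Models a check that uses the kernel page size as its alignment reference. */
static bool check_with_kernel_page_size(uint64_t addr)
{
	return is_aligned(addr, KERNEL_PAGE_SIZE_16K);
}

/* Models the same check made against a fixed 4KiB page size. */
static bool check_with_gpu_page_size(uint64_t addr)
{
	return is_aligned(addr, GPU_PAGE_SIZE);
}
```

For example, the address 0x1000 passes the 4KiB check but fails the 16KiB
one, while 0x4000 passes both. On a 4KiB-paged kernel the two references
coincide, which is why the problem only shows up with larger page sizes.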

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 17, 2025
[ Upstream commit 16d8fd7 ]

In rtl8187_stop(), move the call to usb_kill_anchored_urbs() before the
clearing of b_tx_status.queue. This prevents callbacks from using an
already freed skb, which could happen because the anchor was not killed
before the skb was freed.

 BUG: kernel NULL pointer dereference, address: 0000000000000080
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.15.0 #8 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 RIP: 0010:ieee80211_tx_status_irqsafe+0x21/0xc0 [mac80211]
 Call Trace:
  <IRQ>
  rtl8187_tx_cb+0x116/0x150 [rtl8187]
  __usb_hcd_giveback_urb+0x9d/0x120
  usb_giveback_urb_bh+0xbb/0x140
  process_one_work+0x19b/0x3c0
  bh_worker+0x1a7/0x210
  tasklet_action+0x10/0x30
  handle_softirqs+0xf0/0x340
  __irq_exit_rcu+0xcd/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>
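
The ordering issue can be modelled in user-space C. This is a toy sketch
under simplifying assumptions, not the rtl8187 driver: `toy_dev` and all
`toy_*` functions are hypothetical, and the model simply drops in-flight
transfers when they are killed instead of running their callbacks with an
error status.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device state; hypothetical, not the rtl8187 driver structures. */
struct toy_dev {
	int urbs_in_flight;   /* submitted transfers whose callback has not run */
	bool queue_purged;    /* true once the queued buffers have been freed */
	int stale_callbacks;  /* callbacks that ran after the purge */
};

/* Toy completion callback: it touches the queued buffer it belongs to. */
static void toy_tx_callback(struct toy_dev *dev)
{
	if (dev->queue_purged)
		dev->stale_callbacks++;  /* would be a use of freed memory */
	dev->urbs_in_flight--;
}

/* Toy counterpart of killing anchored URBs: nothing can complete afterwards. */
static void toy_kill_urbs(struct toy_dev *dev)
{
	dev->urbs_in_flight = 0;
}

/* Toy counterpart of freeing every buffer in the status queue. */
static void toy_purge_queue(struct toy_dev *dev)
{
	dev->queue_purged = true;
}

/* Runs a toy stop sequence and returns how many callbacks saw freed buffers. */
static int toy_stop(bool kill_first)
{
	struct toy_dev dev = { .urbs_in_flight = 2 };

	if (kill_first)
		toy_kill_urbs(&dev);
	toy_purge_queue(&dev);

	/* Any transfer still in flight completes after the purge. */
	while (dev.urbs_in_flight > 0)
		toy_tx_callback(&dev);

	if (!kill_first)
		toy_kill_urbs(&dev);
	return dev.stale_callbacks;
}
```

In this model, killing the transfers first leaves no callback that can
observe a purged queue, whereas purging first lets both in-flight
callbacks run against freed buffers.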

Tested on RTL8187BvE device.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c1db52b ("rtl8187: Use usb anchor facilities to manage urbs")
Signed-off-by: Daniil Dulov <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 17, 2025
[ Upstream commit a509a55 ]

MingcongBai added a commit that referenced this pull request Aug 17, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as the assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
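
The mismatch described in the commit message above can be illustrated with a small standalone sketch. It is not the xe driver's code: the constants `GPU_PAGE_SHIFT`, `GPU_PAGE_SIZE` and `CPU_PAGE_16K` and the helpers `ptes_needed()`, `ptes_visited()` and `align_down()` are hypothetical names chosen for illustration. The sketch only assumes, as the message states, that the GPU-side entries use 4KiB granularity while the kernel's `PAGE_SIZE' may be 16KiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not the xe driver's definitions. */
#define GPU_PAGE_SHIFT 12u                               /* 4KiB GPU granule */
#define GPU_PAGE_SIZE  ((uint64_t)1 << GPU_PAGE_SHIFT)
#define CPU_PAGE_16K   ((uint64_t)16 * 1024)             /* 16KiB kernel page */

/* Number of 4KiB entries required to cover `size` bytes. */
uint64_t ptes_needed(uint64_t size)
{
    return (size + GPU_PAGE_SIZE - 1) >> GPU_PAGE_SHIFT;
}

/* Number of entries a walk emits when it advances `step` bytes per entry. */
uint64_t ptes_visited(uint64_t size, uint64_t step)
{
    assert(step != 0);
    uint64_t n = 0;
    for (uint64_t off = 0; off < size; off += step)
        n++;
    return n;
}

/* Round `addr` down to a power-of-two `granule`. */
uint64_t align_down(uint64_t addr, uint64_t granule)
{
    return addr & ~(granule - 1);
}
```

Under these assumptions, a 64KiB buffer needs `ptes_needed(64 * 1024)` = 16 entries, but a walk that steps by the 16KiB kernel page size emits only 4. Likewise, `align_down(0x66b5000, CPU_PAGE_16K)` drops the 4KiB-aligned offset and yields `0x66b4000`, whereas aligning to `GPU_PAGE_SIZE` leaves the address unchanged. Using a fixed 4KiB constant for both the mask and the stride keeps the walk independent of the kernel's page size.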
The SG2042 SoCs support xtheadvector [1], so it can be included in the
devicetree. Also include vlenb for the CPU, set to 16 [2].

This can be tested by passing the "mitigations=off" kernel parameter.

Link: https://lore.kernel.org/linux-riscv/[email protected]/ [1]
Link: https://lore.kernel.org/linux-riscv/aCO44SAoS2kIP61r@ghost/ [2]

Signed-off-by: Han Gao <[email protected]>
Reviewed-by: Inochi Amaoto <[email protected]>
Reviewed-by: Nutty Liu <[email protected]>
Reviewed-by: Chen Wang <[email protected]>
Link: https://lore.kernel.org/r/915bef0530dee6c8bc0ae473837a4bd6786fa4fb.1751698574.git.rabenda.cn@gmail.com
Signed-off-by: Inochi Amaoto <[email protected]>
Signed-off-by: Chen Wang <[email protected]>
Signed-off-by: Chen Wang <[email protected]>
(cherry picked from commit a5fb905)
Signed-off-by: Han Gao <[email protected]>
sycamoremoon and others added 6 commits August 17, 2025 13:36
Enable SPI NOR node for SG2042_EVB_V2 device tree

Signed-off-by: Zixian Zeng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Han Gao <[email protected]>
Add support for PCIe controller in SG2042 SoC. The controller
uses the Cadence PCIe core programmed by pcie-cadence*.c. The
PCIe controller will work in host mode only.

Signed-off-by: Chen Wang <[email protected]>
Link: https://lore.kernel.org/r/ddedd8f76f83fea2c6d3887132d2fe6f2a6a02c1.1736923025.git.unicorn_wang@outlook.com
Signed-off-by: Han Gao <[email protected]>
Document SOPHGO SG2042 compatible for PCIe control registers.
These registers are shared by PCIe controller nodes.

Signed-off-by: Chen Wang <[email protected]>
Acked-by: Rob Herring (Arm) <[email protected]>
Link: https://lore.kernel.org/r/a9b213536c5bbc20de649afae69d2898a75924e4.1736923025.git.unicorn_wang@outlook.com
Signed-off-by: Han Gao <[email protected]>
Add PCIe controller nodes in DTS for Sophgo SG2042.
They are disabled by default.

Signed-off-by: Chen Wang <[email protected]>
Link: https://lore.kernel.org/r/4a1f23e5426bfb56cad9c07f90d4efaad5eab976.1736923025.git.unicorn_wang@outlook.com
Signed-off-by: Han Gao <[email protected]>
@RevySR RevySR force-pushed the aosc/sg2042/v6.16.y branch from b248cba to 55e9c3d Compare August 17, 2025 05:36
@RevySR RevySR changed the base branch from aosc/v6.16 to aosc/v6.16.1 August 17, 2025 05:37
@RevySR RevySR force-pushed the aosc/sg2042/v6.16.y branch from 55e9c3d to 920e7dd Compare August 17, 2025 14:03
@MingcongBai MingcongBai merged commit ddc8ea0 into AOSC-Tracking:aosc/v6.16.1 Aug 17, 2025
MingcongBai added a commit that referenced this pull request Aug 17, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

The current code uses `PAGE_SIZE' as the assumed alignment reference, but
a 4K kernel page size is by no means guaranteed. On 16K-paged kernels,
this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 18, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

The current code uses `PAGE_SIZE' as the assumed alignment reference, but
a 4K kernel page size is by no means guaranteed. On 16K-paged kernels,
this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 19, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as the assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 26, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

The current code uses `PAGE_SIZE' as the assumed alignment reference, but
a 4K kernel page size is by no means guaranteed. On 16K-paged kernels,
this causes driver failures during boot:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
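
A rough model of why an alignment check tied to `PAGE_SIZE' misfires, assuming the cursor advances in 4 KiB GPU-page steps while alignment is asserted against the kernel page size. `walk_ptes' and `is_aligned' are hypothetical helpers written for illustration; they are not functions of the xe driver.

```python
SZ_4K = 4 * 1024
SZ_16K = 16 * 1024


def is_aligned(value: int, granule: int) -> bool:
    """Return True when value is a multiple of the power-of-two granule."""
    return value & (granule - 1) == 0


def walk_ptes(start: int, size: int, align: int) -> int:
    """Advance through [start, start + size) in 4 KiB GPU-page steps.

    Returns how many step offsets fail the alignment check against
    `align`; each failure stands in for one driver warning.
    """
    warnings = 0
    for offset in range(start, start + size, SZ_4K):
        if not is_aligned(offset, align):
            warnings += 1
    return warnings


# Alignment asserted against a 16 KiB kernel page size.
warnings_page_size = walk_ptes(0, SZ_16K, align=SZ_16K)
# Alignment asserted against the fixed 4 KiB GPU page size.
warnings_4k = walk_ptes(0, SZ_16K, align=SZ_4K)
print(warnings_page_size, warnings_4k)
```

In this model, three of the four 4 KiB steps inside one 16 KiB page fail the `PAGE_SIZE'-based check, while none fail once the check uses a 4 KiB granule.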

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 29, 2025
[ Upstream commit 32ca245 ]

Jann Horn reported a use-after-free in unix_stream_read_generic().

The following sequences reproduce the issue:

  $ python3
  from socket import *
  s1, s2 = socketpair(AF_UNIX, SOCK_STREAM)
  s1.send(b'x', MSG_OOB)
  s2.recv(1, MSG_OOB)     # leave a consumed OOB skb
  s1.send(b'y', MSG_OOB)
  s2.recv(1, MSG_OOB)     # leave a consumed OOB skb
  s1.send(b'z', MSG_OOB)
  s2.recv(1)              # recv 'z' illegally
  s2.recv(1, MSG_OOB)     # access 'z' skb (use-after-free)

Even though a user reads OOB data, the skb holding the data stays on
the recv queue to mark the OOB boundary and break the next recv().

After the last send() in the scenario above, the sk2's recv queue has
2 leading consumed OOB skbs and 1 real OOB skb.

Then, the following happens during the next recv() without MSG_OOB:

  1. unix_stream_read_generic() peeks the first consumed OOB skb
  2. manage_oob() returns the next consumed OOB skb
  3. unix_stream_read_generic() fetches the next not-yet-consumed OOB skb
  4. unix_stream_read_generic() reads and frees the OOB skb

After that, the last recv(MSG_OOB) triggers a KASAN splat.

Step 3 above occurs because of the SO_PEEK_OFF code, which does not
expect unix_skb_len(skb) to be 0, but this is true for such consumed
OOB skbs.

  while (skip >= unix_skb_len(skb)) {
    skip -= unix_skb_len(skb);
    skb = skb_peek_next(skb, &sk->sk_receive_queue);
    ...
  }

In addition to this use-after-free, there is another issue:
ioctl(SIOCATMARK) does not function properly with consecutive consumed
OOB skbs.

So, nothing good comes out of such a situation.

Instead of complicating manage_oob(), ioctl() handling, and the next
ECONNRESET fix by introducing a loop for consecutive consumed OOB skbs,
let's not leave such consecutive consumed OOB skbs on the queue
unnecessarily.

Now, while receiving an OOB skb in unix_stream_recv_urg(), if its
previous skb is a consumed OOB skb, it is freed.
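
The queue states described above can be sketched with a small Python model. This is a simplification under stated assumptions: the receive queue is a plain list, a consumed OOB skb is an entry with empty data, and `Skb', `recv_oob' and `skip_for_peek_offset' are hypothetical stand-ins for the kernel structures and loops, not the real af_unix code.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    data: bytes        # remaining payload; b"" models a consumed OOB skb
    oob: bool = False  # True for an skb that was sent with MSG_OOB


def is_consumed_oob(skb):
    return skb.oob and not skb.data


def skip_for_peek_offset(queue, index, skip):
    """Model of the SO_PEEK_OFF loop: step over skbs covered by `skip`.

    A zero-length skb satisfies skip >= len even when skip is 0, so the
    loop walks past every consumed OOB skb it meets.
    """
    while index < len(queue) and skip >= len(queue[index].data):
        skip -= len(queue[index].data)
        index += 1
    return index


def recv_oob(queue, free_previous_consumed):
    """recv(MSG_OOB) on the newest skb, which stays behind as a marker."""
    if free_previous_consumed and len(queue) >= 2 and is_consumed_oob(queue[-2]):
        del queue[-2]  # modeled fix: drop the older consumed OOB skb
    queue[-1].data = b""


def run_scenario(free_previous_consumed):
    """Send and read 'x' and 'y' as OOB, then send 'z' as OOB."""
    queue = []
    for byte in (b"x", b"y"):
        queue.append(Skb(byte, oob=True))
        recv_oob(queue, free_previous_consumed)
    queue.append(Skb(b"z", oob=True))
    return queue


def consecutive_consumed(queue):
    """Count adjacent pairs of consumed OOB skbs."""
    return sum(
        is_consumed_oob(a) and is_consumed_oob(b)
        for a, b in zip(queue, queue[1:])
    )


before = run_scenario(free_previous_consumed=False)
after = run_scenario(free_previous_consumed=True)
# Start from the second consumed skb, as returned in step 2 above.
landed = skip_for_peek_offset(before, index=1, skip=0)
print(len(before), landed, consecutive_consumed(before), consecutive_consumed(after))
```

Without the modeled fix, the queue holds two consumed skbs followed by 'z', and the skip loop lands on the 'z' skb. With it, the queue never holds two consecutive consumed OOB skbs.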

[0]:
BUG: KASAN: slab-use-after-free in unix_stream_read_actor (net/unix/af_unix.c:3027)
Read of size 4 at addr ffff888106ef2904 by task python3/315

CPU: 2 UID: 0 PID: 315 Comm: python3 Not tainted 6.16.0-rc1-00407-gec315832f6f9 #8 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:122)
 print_report (mm/kasan/report.c:409 mm/kasan/report.c:521)
 kasan_report (mm/kasan/report.c:636)
 unix_stream_read_actor (net/unix/af_unix.c:3027)
 unix_stream_read_generic (net/unix/af_unix.c:2708 net/unix/af_unix.c:2847)
 unix_stream_recvmsg (net/unix/af_unix.c:3048)
 sock_recvmsg (net/socket.c:1063 (discriminator 20) net/socket.c:1085 (discriminator 20))
 __sys_recvfrom (net/socket.c:2278)
 __x64_sys_recvfrom (net/socket.c:2291 (discriminator 1) net/socket.c:2287 (discriminator 1) net/socket.c:2287 (discriminator 1))
 do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f8911fcea06
Code: 5d e8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 19 83 e2 39 83 fa 08 75 11 e8 26 ff ff ff 66 0f 1f 44 00 00 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08
RSP: 002b:00007fffdb0dccb0 EFLAGS: 00000202 ORIG_RAX: 000000000000002d
RAX: ffffffffffffffda RBX: 00007fffdb0dcdc8 RCX: 00007f8911fcea06
RDX: 0000000000000001 RSI: 00007f8911a5e060 RDI: 0000000000000006
RBP: 00007fffdb0dccd0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000202 R12: 00007f89119a7d20
R13: ffffffffc4653600 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

Allocated by task 315:
 kasan_save_stack (mm/kasan/common.c:48)
 kasan_save_track (mm/kasan/common.c:60 (discriminator 1) mm/kasan/common.c:69 (discriminator 1))
 __kasan_slab_alloc (mm/kasan/common.c:348)
 kmem_cache_alloc_node_noprof (./include/linux/kasan.h:250 mm/slub.c:4148 mm/slub.c:4197 mm/slub.c:4249)
 __alloc_skb (net/core/skbuff.c:660 (discriminator 4))
 alloc_skb_with_frags (./include/linux/skbuff.h:1336 net/core/skbuff.c:6668)
 sock_alloc_send_pskb (net/core/sock.c:2993)
 unix_stream_sendmsg (./include/net/sock.h:1847 net/unix/af_unix.c:2256 net/unix/af_unix.c:2418)
 __sys_sendto (net/socket.c:712 (discriminator 20) net/socket.c:727 (discriminator 20) net/socket.c:2226 (discriminator 20))
 __x64_sys_sendto (net/socket.c:2233 (discriminator 1) net/socket.c:2229 (discriminator 1) net/socket.c:2229 (discriminator 1))
 do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Freed by task 315:
 kasan_save_stack (mm/kasan/common.c:48)
 kasan_save_track (mm/kasan/common.c:60 (discriminator 1) mm/kasan/common.c:69 (discriminator 1))
 kasan_save_free_info (mm/kasan/generic.c:579 (discriminator 1))
 __kasan_slab_free (mm/kasan/common.c:271)
 kmem_cache_free (mm/slub.c:4643 (discriminator 3) mm/slub.c:4745 (discriminator 3))
 unix_stream_read_generic (net/unix/af_unix.c:3010)
 unix_stream_recvmsg (net/unix/af_unix.c:3048)
 sock_recvmsg (net/socket.c:1063 (discriminator 20) net/socket.c:1085 (discriminator 20))
 __sys_recvfrom (net/socket.c:2278)
 __x64_sys_recvfrom (net/socket.c:2291 (discriminator 1) net/socket.c:2287 (discriminator 1) net/socket.c:2287 (discriminator 1))
 do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

The buggy address belongs to the object at ffff888106ef28c0
 which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 68 bytes inside of
 freed 224-byte region [ffff888106ef28c0, ffff888106ef29a0)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff888106ef3cc0 pfn:0x106ef2
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x200000000000040(head|node=0|zone=2)
page_type: f5(slab)
raw: 0200000000000040 ffff8881001d28c0 ffffea000422fe00 0000000000000004
raw: ffff888106ef3cc0 0000000080190010 00000000f5000000 0000000000000000
head: 0200000000000040 ffff8881001d28c0 ffffea000422fe00 0000000000000004
head: ffff888106ef3cc0 0000000080190010 00000000f5000000 0000000000000000
head: 0200000000000001 ffffea00041bbc81 00000000ffffffff 00000000ffffffff
head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888106ef2800: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
 ffff888106ef2880: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
>ffff888106ef2900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888106ef2980: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888106ef2a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 314001f ("af_unix: Add OOB support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 29, 2025
[ Upstream commit 2d72afb ]

A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
 #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
 #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
 #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:

 ct hlist pointer is garbage; looks like the ct hash value
 (hence crash).
 ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
 ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like a normal UDP conntrack entry.  If we ignore
ct->status and pretend it's 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow
    rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.

The theory is that we hit the following race:

cpu x 			cpu y			cpu z
 found entry E		found entry E
 E is expired		<preemption>
 nf_ct_delete()
 return E to rcu slab
					init_conntrack
					E is re-inited,
					ct->status set to 0
					reply tuplehash hnnode.pprev
					stores hash value.

cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z.  cpu y was preempted before
checking for expiry and/or confirm bit.

					->refcnt set to 1
					E now owned by skb
					->timeout set to 30000

If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.

					nf_conntrack_confirm gets called
					sets: ct->status |= CONFIRMED
					This is wrong: E is not yet added
					to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
			<resumes>
			nf_ct_expired()
			 -> yes (ct->timeout is 30s)
			confirmed bit set.

cpu y will try to delete E from the hashtable:
			nf_ct_delete() -> set DYING bit
			__nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:

			wait for spinlock held by z

					CONFIRMED is set but there is no
					guarantee ct will be added to hash:
					"chaintoolong" or "clash resolution"
					logic both skip the insert step.
					reply hnnode.pprev still stores the
					hash value.

					unlocks spinlock
					return NF_DROP
			<unblocks, then
			 crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.

Pablo points out that the confirm-bit store could be reordered to happen
before the hlist add or the timeout fixup, so switch to set_bit and a
before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:

Such an event cannot be distinguished from the "E is the old incarnation"
case above: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
 1. Check if entry has expired, if not skip to next entry
 2. Obtain a reference to the expired entry.
 3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time.  Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

 1. Check if entry has expired.
 2. Obtain a reference.
 3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
      and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.
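
The effect of the check order can be sketched as a sequential Python model. It assumes the other CPU's update lands between the two reads, which is the worst case of the sequence above; the dying-bit check is left out, and `Conn', `confirm', `should_gc_old' and `should_gc_new' are names invented for this sketch.

```python
from dataclasses import dataclass

NEW_UDP_TIMEOUT = 30000  # relative timeout of a fresh entry, as in the report


@dataclass
class Conn:
    timeout: int = NEW_UDP_TIMEOUT  # relative until confirmed, then absolute
    confirmed: bool = False


def is_expired(ct, now):
    return ct.timeout - now <= 0


def confirm(ct, now):
    """Other CPU: fix up the timeout first, then publish the confirmed bit."""
    ct.timeout += now
    ct.confirmed = True


def should_gc_old(ct, now, racing_cpu):
    expired = is_expired(ct, now)    # check 4: reads the relative timeout
    racing_cpu()                     # step 5: timeout fixed, confirm bit set
    return expired and ct.confirmed  # check 6: valid entry gets removed


def should_gc_new(ct, now, racing_cpu):
    confirmed = ct.confirmed         # check 6 first
    racing_cpu()                     # the same racing update
    return confirmed and is_expired(ct, now)  # then check 4


now = 1_000_000

ct_old = Conn()
gc_old = should_gc_old(ct_old, now, lambda: confirm(ct_old, now))

ct_new = Conn()
gc_new = should_gc_new(ct_new, now, lambda: confirm(ct_new, now))

# An entry that was fully confirmed before the check is kept as well.
ct_done = Conn()
confirm(ct_done, now)
gc_done = should_gc_new(ct_done, now, lambda: None)

print(gc_old, gc_new, gc_done)
```

With the old order the model removes a valid entry; with the new order it either sees the confirmed bit unset or sees the corrected absolute timeout.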

This change cannot be backported to releases before 5.19. Without
commit 8a75a2c ("netfilter: conntrack: remove unconfirmed list"), the
|= IPS_CONFIRMED line cannot be moved without further changes.

Cc: Razvan Cojocaru <[email protected]>
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Fixes: 1397af5 ("netfilter: conntrack: remove the percpu dying list")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 29, 2025
[ Upstream commit 16d8fd7 ]

In rtl8187_stop(), move the call to usb_kill_anchored_urbs() before clearing
b_tx_status.queue. This prevents callbacks from using an already freed skb,
which could happen because the anchor was not killed before the skb was freed.
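
The ordering problem can be illustrated with a simplified Python model. It serializes the race in its worst-case order and assumes a pending URB callback still references an skb that the queue purge frees; `Skb', `tx_callback' and `stop' are hypothetical names and do not mirror the driver's actual data structures.

```python
class Skb:
    def __init__(self, name):
        self.name = name
        self.freed = False


def tx_callback(skb):
    """Stand-in for a URB completion callback that reports TX status."""
    if skb.freed:
        raise RuntimeError(f"callback touched freed skb {skb.name}")
    return skb.name


def stop(status_queue, anchored, kill_first):
    """Tear down; `anchored` holds skbs whose URBs are still in flight."""
    completed = []

    def kill_anchored():
        # Pending callbacks finish here; nothing runs after this returns.
        while anchored:
            completed.append(tx_callback(anchored.pop()))

    def purge_queue():
        # Free every skb still waiting on the status queue.
        while status_queue:
            status_queue.pop().freed = True

    if kill_first:
        kill_anchored()
        purge_queue()
    else:
        purge_queue()
        kill_anchored()
    return completed


def run(kill_first):
    skb = Skb("skb0")
    try:
        return stop([skb], [skb], kill_first)
    except RuntimeError as err:
        return str(err)


fixed_order = run(kill_first=True)
old_order = run(kill_first=False)
print(fixed_order, old_order)
```

Killing the anchor first lets the callback finish while the skb is still valid; purging first leaves the callback with a freed skb.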

 BUG: kernel NULL pointer dereference, address: 0000000000000080
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.15.0 #8 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 RIP: 0010:ieee80211_tx_status_irqsafe+0x21/0xc0 [mac80211]
 Call Trace:
  <IRQ>
  rtl8187_tx_cb+0x116/0x150 [rtl8187]
  __usb_hcd_giveback_urb+0x9d/0x120
  usb_giveback_urb_bh+0xbb/0x140
  process_one_work+0x19b/0x3c0
  bh_worker+0x1a7/0x210
  tasklet_action+0x10/0x30
  handle_softirqs+0xf0/0x340
  __irq_exit_rcu+0xcd/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>

Tested on RTL8187BvE device.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c1db52b ("rtl8187: Use usb anchor facilities to manage urbs")
Signed-off-by: Daniil Dulov <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai pushed a commit that referenced this pull request Aug 29, 2025
[ Upstream commit a509a55 ]

syzbot [1] reported the following:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced with the reproducer [2]. Once
CONFIG_F2FS_CHECK_FS is enabled, the reproducer triggers the panic below,
so the direct cause of this bug is the same as the one fixed by patch [3].

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/[email protected]

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is that, in the fuzzed image, dnode #8 belongs to inode #7,
so dnode #8 was dropped after inode #7 was evicted.

However, there is a dirent that has ino #8. Once we unlink file3, both
f2fs_truncate() and f2fs_update_inode_page() fail in f2fs_evict_inode()
because node #8 cannot be loaded, and as a result we miss the call to
f2fs_inode_synced() that clears the inode's dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().
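
A minimal Python model of the error path, assuming the dirty state is tracked by membership in a list that a later checkpoint walks. `Inode', `evict' and `checkpoint' are illustrative names only and do not reproduce the f2fs implementation.

```python
class Inode:
    def __init__(self, ino):
        self.ino = ino
        self.freed = False


def inode_synced(dirty_list, inode):
    """Clear the dirty state by dropping the inode from the dirty list."""
    if inode in dirty_list:
        dirty_list.remove(inode)


def evict(dirty_list, inode, node_loadable, sync_on_error):
    """Evict an inode; a failed node load takes the error path."""
    if node_loadable:
        inode_synced(dirty_list, inode)  # normal path: the update succeeds
    elif sync_on_error:
        inode_synced(dirty_list, inode)  # modeled fix: also clear on error
    inode.freed = True                   # the inode is released either way


def checkpoint(dirty_list):
    """Walk the dirty list; return inode numbers that were already freed."""
    return [inode.ino for inode in dirty_list if inode.freed]


def run(sync_on_error):
    inode = Inode(8)
    dirty_list = [inode]
    evict(dirty_list, inode, node_loadable=False, sync_on_error=sync_on_error)
    return checkpoint(dirty_list)


stale_without_fix = run(sync_on_error=False)
stale_with_fix = run(sync_on_error=True)
print(stale_without_fix, stale_with_fix)
```

Without the extra call, the freed inode stays on the dirty list and the later walk reaches it, which corresponds to the use-after-free in the report; with the call, the list is empty by then.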

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but not in v6.16-rc4, because there the testcase stops early once f2fs
detects other corruption:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 29, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference, but a 4K
kernel page size is by no means guaranteed. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
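
The mismatch described above can be illustrated with a small userspace sketch. It is not the driver's code: `sketch_is_aligned()', `sketch_ptes_per_cpu_page()' and the `SKETCH_SZ_*' constants are hypothetical stand-ins for the kernel's `IS_ALIGNED()', `SZ_4K' and a 16KiB `PAGE_SIZE', and it assumes the warning comes from an alignment check made against the CPU page size while the cursor is advanced in 4KiB steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; the kernel provides SZ_4K and PAGE_SIZE. */
#define SKETCH_SZ_4K  UINT64_C(0x1000)  /* granule the GPU PTEs are emitted in */
#define SKETCH_SZ_16K UINT64_C(0x4000)  /* CPU page size on the failing host */

/* True when x is a multiple of align; align must be a power of two. */
static bool sketch_is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/* Number of 4KiB steps needed to walk across one CPU page. */
static uint64_t sketch_ptes_per_cpu_page(uint64_t cpu_page_size)
{
	return cpu_page_size / SKETCH_SZ_4K;
}
```

Under these assumptions, an offset such as 0x3000 passes a check against the 4KiB granule but fails the same check against a 16KiB page size, and one 16KiB CPU page spans four 4KiB steps. On a 4KiB-paged kernel the two granules coincide, which would explain why the problem only shows up with larger page sizes.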

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 29, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as an assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot-up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Aug 30, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as an assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot-up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 4, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

The current code uses `PAGE_SIZE' as an assumed alignment reference, but a
4K kernel page size is by no means guaranteed. On 16K-paged kernels, this
causes driver failures during boot-up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 9, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

The current implementation uses `PAGE_SIZE' as an assumed alignment
reference, but a 4KiB kernel page size is by no means guaranteed. On
16KiB-paged kernels, this causes driver failures during boot-up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 9, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

The current code uses `PAGE_SIZE' as an assumed alignment reference, but a
4K kernel page size is by no means guaranteed. On 16K-paged kernels, this
causes driver failures during boot-up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 10, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but a 4KiB kernel page size is by no means guaranteed. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 11, 2025
MingcongBai added a commit that referenced this pull request Sep 12, 2025
MingcongBai added a commit that referenced this pull request Sep 22, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but a 4KiB kernel page size is by no means guaranteed. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
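
One way a mismatch like this can surface is in how far a cursor advances for each 4 KiB entry. The following is a minimal user-space sketch under stated assumptions, not the driver code: every `demo_*` / `DEMO_*` name is hypothetical, and the 16 KiB CPU page size is assumed from the "16KiB-paged kernels" described above. It shows that advancing by the CPU page size while emitting one entry per 4 KiB overshoots the buffer by a factor of four.

```c
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define DEMO_SZ_4K    UINT64_C(4096)   /* bytes covered by one entry in this sketch */
#define DEMO_PAGE_16K UINT64_C(16384)  /* assumed CPU page size of the failing kernel */

/* Number of 4 KiB entries needed to cover size bytes (size is a multiple of 4 KiB). */
static uint64_t demo_entries_for(uint64_t size)
{
	return size / DEMO_SZ_4K;
}

/* Bytes a cursor has advanced after all entries, moving step bytes per entry. */
static uint64_t demo_bytes_advanced(uint64_t size, uint64_t step)
{
	return demo_entries_for(size) * step;
}
```

For a 64 KiB buffer, `demo_bytes_advanced(65536, DEMO_SZ_4K)` stays within the buffer, while `demo_bytes_advanced(65536, DEMO_PAGE_16K)` walks four times its length.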

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Wenbin Fang <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Jianfeng Liu <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Kexy Biscuit <[email protected]>
MingcongBai added a commit that referenced this pull request Sep 22, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference, but a 4K
kernel page size is by no means guaranteed. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.
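
For readers less familiar with the constants involved, the relationship between a shift of 12, a 4 KiB size, and the corresponding mask can be sketched as below. This is an illustrative user-space example only: the `DEMO_4K_*` macros and `demo_*` helpers are hypothetical names and are not the driver's own definitions. The point is that all three values are fixed and do not change with the kernel's `PAGE_SIZE`.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not the driver's macros. */
#define DEMO_4K_SHIFT 12
#define DEMO_4K_SIZE  (UINT64_C(1) << DEMO_4K_SHIFT)  /* 4096 bytes */
#define DEMO_4K_MASK  (DEMO_4K_SIZE - 1)              /* 0xfff */

/* Round addr down to the start of its 4 KiB unit. */
static uint64_t demo_round_down_4k(uint64_t addr)
{
	return addr & ~DEMO_4K_MASK;
}

/* Index of the 4 KiB unit that contains addr. */
static uint64_t demo_index_4k(uint64_t addr)
{
	return addr >> DEMO_4K_SHIFT;
}
```

Because the helpers depend only on the fixed shift, they give the same results on 4 KiB, 16 KiB, and 64 KiB kernels.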

Cc: [email protected]
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <[email protected]>
Tested-by: Haien Liang <[email protected]>
Tested-by: Shirong Liu <[email protected]>
Tested-by: Haofeng Wu <[email protected]>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <[email protected]>
Signed-off-by: Shang Yatsen <[email protected]>
Signed-off-by: Mingcong Bai <[email protected]>

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mingcong Bai <[email protected]>