forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 40
Linux 6.6 mchp+fpga #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
harihanv
wants to merge
10,000
commits into
linux4microchip:linux-6.1-mchp
from
harihanv:linux-6.6-mchp+fpga
Closed
Linux 6.6 mchp+fpga #2
harihanv
wants to merge
10,000
commits into
linux4microchip:linux-6.1-mchp
from
harihanv:linux-6.6-mchp+fpga
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[ Upstream commit bec7760 ] The amplifier doesn't loose register state in software shutdown mode, so there is no need to reset the cur_* values. Without these resets, the amplifier can be turned on after runtime_suspend without waiting for the program and profile to be restored. Fixes: 5be27f1 ("ALSA: hda/tas2781: Add tas2781 HDA driver") Signed-off-by: Gergo Koteles <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Message-ID: <aa27ae084150988bf6a0ead7e3403bc485d790f8.1709918447.git.soyer@irl.hu> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 76f5f55 ] Make calibration functions configurable to support different calibration data storage modes. Signed-off-by: Gergo Koteles <[email protected]> Link: https://lore.kernel.org/r/5859c77ffef752b8a9784713b412d815d7e2688c.1703891777.git.soyer@irl.hu Signed-off-by: Takashi Iwai <[email protected]> Stable-dep-of: 5f51de7 ("ALSA: hda/tas2781: do not call pm_runtime_force_* in system_resume/suspend") Signed-off-by: Sasha Levin <[email protected]>
…spend [ Upstream commit 5f51de7 ] The runtime_resume function calls prmg_load and apply_calibration functions, but system_resume also calls them, so calling pm_runtime_force_resume before reset is unnecessary. For consistency, do not call the pm_runtime_force_suspend in system_suspend, as runtime_suspend does the same. Fixes: 5be27f1 ("ALSA: hda/tas2781: Add tas2781 HDA driver") Signed-off-by: Gergo Koteles <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Message-ID: <d0b4cc1248b9d375d59c009563da42d60d69eac3.1709918447.git.soyer@irl.hu> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 68f7f3f ] Make the amp available immediately after a module load to avoid having to wait for a PCM hook action. (eg. unloading & loading the module while listening music) Signed-off-by: Gergo Koteles <[email protected]> Link: https://lore.kernel.org/r/7f2f65d9212aa16edd4db8725489ae59dbe74c66.1703895108.git.soyer@irl.hu Signed-off-by: Takashi Iwai <[email protected]> Stable-dep-of: 9fc91a6 ("ALSA: hda/tas2781: restore power state after system_resume") Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 9fc91a6 ] After system_resume the amplifers will remain off, even if they were on before system_suspend. Use playback_started bool to save the playback state, and restore power state based on it. Fixes: 5be27f1 ("ALSA: hda/tas2781: Add tas2781 HDA driver") Signed-off-by: Gergo Koteles <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Message-ID: <1742b61901781826f6e6212ffe1d21af542d134a.1709918447.git.soyer@irl.hu> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a39d51f ] If a usb audio device sets more bits than the amount of channels it could write outside of the map array. Signed-off-by: Johan Carlsson <[email protected]> Fixes: 04324cc ("ALSA: usb-audio: add channel map support") Message-ID: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 926e8ea ] Remove the unneeded assignment of the qp_num which is already set in irdma_create_qp(). Fixes: b48c24c ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <[email protected]> Signed-off-by: Shiraz Saleem <[email protected]> Signed-off-by: Sindhu Devale <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
…etup [ Upstream commit c21a887 ] Upon rare occasions, KASAN reports a use-after-free Write in srpt_refresh_port(). This seems to be because an event handler is registered before the srpt device is fully setup and a race condition upon error may leave a partially setup event handler in place. Instead, only register the event handler after srpt device initialization is complete. Fixes: a42d985 ("ib_srpt: Initial SRP Target merge for v3.3-rc1") Signed-off-by: William Kucharski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 8a430dd ] If data block in compressed cluster is not persisted with metadata during checkpoint, after SPOR, the data may be corrupted, let's guarantee to write compressed page by checkpoint. Fixes: 4c8ff70 ("f2fs: support data compression") Reviewed-by: Daeho Jeong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit fd24452 ] When we overwrite compressed cluster w/ normal cluster, we should not unlock cp_rwsem during f2fs_write_raw_pages(), otherwise data will be corrupted if partial blocks were persisted before CP & SPOR, due to cluster metadata wasn't updated atomically. Fixes: 4c8ff70 ("f2fs: support data compression") Reviewed-by: Daeho Jeong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit eb8fbaa ] Compressed cluster may not be released due to we can fail in release_compress_blocks(), fix to handle reserved compressed cluster correctly in reserve_compress_blocks(). Fixes: 4c8ff70 ("f2fs: support data compression") Signed-off-by: Sheng Yong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a539363 ] Commit 3c6c2be ("f2fs: avoid punch_hole overhead when releasing volatile data") introduced FI_FIRST_BLOCK_WRITTEN as below reason: This patch is to avoid some punch_hole overhead when releasing volatile data. If volatile data was not written yet, we just can make the first page as zero. After commit 7bc155f ("f2fs: kill volatile write support"), we won't support volatile write, but it missed to remove obsolete FI_FIRST_BLOCK_WRITTEN, delete it in this patch. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Stable-dep-of: 5460749 ("f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode") Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit bb6e1c8 ] FI_DROP_CACHE was introduced in commit 1e84371 ("f2fs: change atomic and volatile write policies") for volatile write feature, after commit 7bc155f ("f2fs: kill volatile write support"), we won't support volatile write, let's delete related codes. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Stable-dep-of: 5460749 ("f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode") Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 2020cd4 ] Just cleanup, no logic changes. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Stable-dep-of: 5460749 ("f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode") Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 59d0d4c ] This patch allows caller to pass blkaddr to f2fs_set_data_blkaddr() and let __set_data_blkaddr() inside f2fs_set_data_blkaddr() to update dn->data_blkaddr w/ last value of blkaddr. Just cleanup, no logic changes. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Stable-dep-of: 5460749 ("f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode") Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 5460749 ] In reserve_compress_blocks(), we update blkaddrs of dnode in prior to inc_valid_block_count(), it may cause inconsistent status bewteen i_blocks and blkaddrs once inc_valid_block_count() fails. To fix this issue, it needs to reverse their invoking order. Fixes: c75488f ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Reviewed-by: Daeho Jeong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit b896e30 ] verify_blkaddr() will trigger panic once we inject fault into f2fs_is_valid_blkaddr(), fix to remove this unnecessary f2fs_bug_on(). Fixes: 18792e6 ("f2fs: support fault injection for f2fs_is_valid_blkaddr()") Reviewed-by: Daeho Jeong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 536af82 ] It needs to check last zone_pending_bio and wait IO completion before traverse next fio in io->io_list, otherwise, bio in next zone may be submitted before all IO completion in current zone. Fixes: e067dc3 ("f2fs: maintain six open zones for zoned devices") Cc: Daeho Jeong <[email protected]> Signed-off-by: Chao Yu <[email protected]> Reviewed-by: Daeho Jeong <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit c2034ef ] BUG: kernel NULL pointer dereference, address: 0000000000000014 RIP: 0010:f2fs_submit_page_write+0x6cf/0x780 [f2fs] Call Trace: <TASK> ? show_regs+0x6e/0x80 ? __die+0x29/0x70 ? page_fault_oops+0x154/0x4a0 ? prb_read_valid+0x20/0x30 ? __irq_work_queue_local+0x39/0xd0 ? irq_work_queue+0x36/0x70 ? do_user_addr_fault+0x314/0x6c0 ? exc_page_fault+0x7d/0x190 ? asm_exc_page_fault+0x2b/0x30 ? f2fs_submit_page_write+0x6cf/0x780 [f2fs] ? f2fs_submit_page_write+0x736/0x780 [f2fs] do_write_page+0x50/0x170 [f2fs] f2fs_outplace_write_data+0x61/0xb0 [f2fs] f2fs_do_write_data_page+0x3f8/0x660 [f2fs] f2fs_write_single_data_page+0x5bb/0x7a0 [f2fs] f2fs_write_cache_pages+0x3da/0xbe0 [f2fs] ... It is possible that other threads have added this fio to io->bio and submitted the io->bio before entering f2fs_submit_page_write(). At this point io->bio = NULL. If is_end_zone_blkaddr(sbi, fio->new_blkaddr) of this fio is true, then an NULL pointer dereference error occurs at bio_get(io->bio). The original code for determining zone end was after "out:", which would have missed some fio who is zone end. I've moved this code before "skip:" to make sure it's done for each fio. Fixes: e067dc3 ("f2fs: maintain six open zones for zoned devices") Signed-off-by: Wenjie Qi <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 2f9420d ] - f2fs_disable_compressed_file - check inode_has_data - f2fs_file_mmap - mkwrite - f2fs_get_block_locked : update metadata in compressed inode's disk layout - fi->i_flags &= ~F2FS_COMPR_FL - clear_inode_flag(inode, FI_COMPRESSED_FILE); we should use i_sem lock to prevent above race case. Fixes: 4c8ff70 ("f2fs: support data compression") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 21ec682 ] During recovery, if FAULT_BLOCK is on, it is possible that f2fs_reserve_new_block() will return -ENOSPC during recovery, then it may trigger panic. Also, if fault injection rate is 1 and only FAULT_BLOCK fault type is on, it may encounter deadloop in loop of block reservation. Let's change as below to fix these issues: - remove bug_on() to avoid panic. - limit the loop count of block reservation to avoid potential deadloop. Fixes: 956fa1d ("f2fs: fix to check return value of f2fs_reserve_new_block()") Reported-by: Zhiguo Niu <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 9f3dbcb ] csiostor uses function pointer casts to keep the csio_ln_ev state machine hidden, but this causes warnings about control flow integrity (KCFI) violations in clang-16 and higher: drivers/scsi/csiostor/csio_lnode.c:1098:33: error: cast from 'void (*)(struct csio_lnode *, enum csio_ln_ev)' to 'csio_sm_state_t' (aka 'void (*)(void *, unsigned int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 1098 | return (csio_get_state(ln) == ((csio_sm_state_t)csio_lns_ready)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/csiostor/csio_lnode.c:1369:29: error: cast from 'void (*)(struct csio_lnode *, enum csio_ln_ev)' to 'csio_sm_state_t' (aka 'void (*)(void *, unsigned int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 1369 | if (csio_get_state(ln) == ((csio_sm_state_t)csio_lns_uninit)) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/csiostor/csio_lnode.c:1373:29: error: cast from 'void (*)(struct csio_lnode *, enum csio_ln_ev)' to 'csio_sm_state_t' (aka 'void (*)(void *, unsigned int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 1373 | if (csio_get_state(ln) == ((csio_sm_state_t)csio_lns_ready)) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/csiostor/csio_lnode.c:1377:29: error: cast from 'void (*)(struct csio_lnode *, enum csio_ln_ev)' to 'csio_sm_state_t' (aka 'void (*)(void *, unsigned int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 1377 | if (csio_get_state(ln) == ((csio_sm_state_t)csio_lns_offline)) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Move the enum into a shared header so the correct types can be used without the need for casts. Fixes: a3667aa ("[SCSI] csiostor: Chelsio FCoE offload driver") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 1020139 ] Disable IBI IRQ signal and status only when hot-join and SIR enabling of all target devices attached to the bus are disabled. Fixes: e389b1d ("i3c: dw: Add support for in-band interrupts") Signed-off-by: Dylan Hung <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit d20a7cf ] Commit 27c5fd2 ("RDMA/hns: The UD mode can only be configured with DCQCN") adds a check of congest control alorithm for UD. But that patch causes a problem: hr_dev->caps.congest_type is global, used by all QPs, so modifying this field to DCQCN for UD QPs causes other QPs unable to use any other algorithm except DCQCN. Revert the modification in commit 27c5fd2 ("RDMA/hns: The UD mode can only be configured with DCQCN"). Add a new field cong_type to struct hns_roce_qp and configure DCQCN for UD QPs. Fixes: 27c5fd2 ("RDMA/hns: The UD mode can only be configured with DCQCN") Fixes: f91696f ("RDMA/hns: Support congestion control type selection according to the FW") Signed-off-by: Luoyouming <[email protected]> Signed-off-by: Junxian Huang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7a8bccd ] The mad_client will be initialized in enable_device_and_get(), while the devices_rwsem will be downgraded to a read semaphore. There is a window that leads to the failed initialization for cm_client, since it can not get matched mad port from ib_mad_port_list, and the matched mad port will be added to the list after that. mad_client | cm_client ------------------|-------------------------------------------------------- ib_register_device| enable_device_and_get down_write(&devices_rwsem) xa_set_mark(&devices, DEVICE_REGISTERED) downgrade_write(&devices_rwsem) | |ib_cm_init |ib_register_client(&cm_client) |down_read(&devices_rwsem) |xa_for_each_marked (&devices, DEVICE_REGISTERED) |add_client_context |cm_add_one |ib_register_mad_agent |ib_get_mad_port |__ib_get_mad_port |list_for_each_entry(entry, &ib_mad_port_list, port_list) |return NULL |up_read(&devices_rwsem) | add_client_context| ib_mad_init_device| ib_mad_port_open | list_add_tail(&port_priv->port_list, &ib_mad_port_list) up_read(&devices_rwsem) | Fix it by using down_write(&devices_rwsem) in ib_register_client(). Fixes: d089989 ("RDMA/device: Provide APIs from the core code to help unregistration") Link: https://lore.kernel.org/r/[email protected] Suggested-by: Jason Gunthorpe <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7a7b7f5 ] strnlen() may return 0 (e.g. for "\0\n" string), it's better to check the result of strnlen() before using 'len - 1' expression for the 'buf' array index. Detected using the static analysis tool - Svace. Fixes: dc3b66a ("RDMA/rtrs-clt: Add a minimum latency multipath policy") Signed-off-by: Alexey Kodanev <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Jack Wang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit b696002 ] Some callback functions used here take a boolean argument, others take a status argument. This breaks KCFI type checking, so clang now warns about the function pointer cast: drivers/scsi/bfa/bfad_bsg.c:2138:29: error: cast from 'void (*)(void *, enum bfa_status)' to 'bfa_cb_cbfn_t' (aka 'void (*)(void *, enum bfa_boolean)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] Assuming the code is actually correct here and the callers always match the argument types of the callee, rework this to replace the explicit cast with a union of the two pointer types. This does not change the behavior of the code, so if something is actually broken here, a larger rework may be necessary. Fixes: 37ea055 ("[SCSI] bfa: Added support to collect and reset fcport stats") Fixes: 3ec4f2c ("[SCSI] bfa: Added support to configure QOS and collect stats.") Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 40b2d55 ] #generic/700 - output mismatch (see /media/fstests/results//generic/700.out.bad) # --- tests/generic/700.out 2023-03-28 10:40:42.735529223 +0000 # +++ /media/fstests/results//generic/700.out.bad 2024-02-06 04:37:56.000000000 +0000 # @@ -1,2 +1,4 @@ # QA output created by 700 # +/mnt/scratch_f2fs/f1: security.selinux: No such attribute # +/mnt/scratch_f2fs/f2: security.selinux: No such attribute # Silence is golden # ... # (Run 'diff -u /media/fstests/tests/generic/700.out /media/fstests/results//generic/700.out.bad' to see the entire diff) HINT: You _MAY_ be missing kernel fix: 70b589a xfs: add selinux labels to whiteout inodes Previously, it missed to create selinux labels during whiteout inode initialization, fix this issue. Fixes: 7e01e7a ("f2fs: support RENAME_WHITEOUT") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
…ption [ Upstream commit e39602d ] f2fs only support to config zstd compress level w/ a positive number due to layout design, but since commit e0c1b49 ("lib: zstd: Upgrade to latest upstream zstd version 1.4.10"), zstd supports negative compress level, so that zstd_min_clevel() may return a negative number, then w/ below mount option, .compress_level can be configed w/ a negative number, which is not allowed to f2fs, let's add check condition to avoid it. mount -o compress_algorithm=zstd:4294967295 /dev/sdx /mnt/f2fs Fixes: 00e120b ("f2fs: assign default compression level") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit d6f4de7 ] The intent is to check if the strings' are truncated or not. So, >= should be used instead of >, because strlcat() and snprintf() return the length of the output, excluding the trailing NULL. Fixes: a02d692 ("SUNRPC: Provide functions for managing universal addresses") Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Cristian Birsan <[email protected]>
On SoCs, like the SAM9X75, which embed the XLCDC ip, the registers that configure the unified scaling engine were not filled with proper values. Indeed, for YCbCr formats, the VXSCFACT bitfield of the HEOCFG25 register and the HXSCFACT bitfield of the HEOCFG27 register were incorrect. For 4:2:0 formats, both vertical and horizontal factors for chroma chanels should be divided by 2 from the factors for the luma channel. Hence: HEOCFG24.VXSYFACT = VFACTOR HEOCFG25.VSXCFACT = VFACTOR / 2 HEOCFG26.HXSYFACT = HFACTOR HEOCFG27.HXSCFACT = HFACTOR / 2 However, for 4:2:2 formats, only the horizontal factor for chroma chanels should be divided by 2 from the factor for the luma channel; the vertical factor is the same for all the luma and chroma channels. Hence: HEOCFG24.VXSYFACT = VFACTOR HEOCFG25.VXSCFACT = VFACTOR HEOCFG26.HXSYFACT = HFACTOR HEOCFG27.HXSCFACT = HFACTOR / 2 Signed-off-by: Cyrille Pitchen <[email protected]>
Signed-off-by: Cristian Birsan <[email protected]>
Remove the legacy gpio apis and use gpio descriptor based apis to request and read the value of lane config gpio Signed-off-by: Balamanikandan Gunasundar <[email protected]>
Add ADC trigger support to the sam9x75_curiosity board. Signed-off-by: Durai Manickam KR <[email protected]>
Signed-off-by: Conor Dooley <[email protected]>
…+fpga Signed-off-by: Conor Dooley <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
[ Upstream commit b313a8c ] Lockdep reported a warning in Linux version 6.6: [ 414.344659] ================================ [ 414.345155] WARNING: inconsistent lock state [ 414.345658] 6.6.0-07439-gba2303cacfda #6 Not tainted [ 414.346221] -------------------------------- [ 414.346712] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 414.347545] kworker/u10:3/1152 [HC0[0]:SC0[0]:HE0:SE1] takes: [ 414.349245] ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0 [ 414.351204] {IN-SOFTIRQ-W} state was registered at: [ 414.351751] lock_acquire+0x18d/0x460 [ 414.352218] _raw_spin_lock_irqsave+0x39/0x60 [ 414.352769] __wake_up_common_lock+0x22/0x60 [ 414.353289] sbitmap_queue_wake_up+0x375/0x4f0 [ 414.353829] sbitmap_queue_clear+0xdd/0x270 [ 414.354338] blk_mq_put_tag+0xdf/0x170 [ 414.354807] __blk_mq_free_request+0x381/0x4d0 [ 414.355335] blk_mq_free_request+0x28b/0x3e0 [ 414.355847] __blk_mq_end_request+0x242/0xc30 [ 414.356367] scsi_end_request+0x2c1/0x830 [ 414.345155] WARNING: inconsistent lock state [ 414.345658] 6.6.0-07439-gba2303cacfda #6 Not tainted [ 414.346221] -------------------------------- [ 414.346712] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 414.347545] kworker/u10:3/1152 [HC0[0]:SC0[0]:HE0:SE1] takes: [ 414.349245] ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0 [ 414.351204] {IN-SOFTIRQ-W} state was registered at: [ 414.351751] lock_acquire+0x18d/0x460 [ 414.352218] _raw_spin_lock_irqsave+0x39/0x60 [ 414.352769] __wake_up_common_lock+0x22/0x60 [ 414.353289] sbitmap_queue_wake_up+0x375/0x4f0 [ 414.353829] sbitmap_queue_clear+0xdd/0x270 [ 414.354338] blk_mq_put_tag+0xdf/0x170 [ 414.354807] __blk_mq_free_request+0x381/0x4d0 [ 414.355335] blk_mq_free_request+0x28b/0x3e0 [ 414.355847] __blk_mq_end_request+0x242/0xc30 [ 414.356367] scsi_end_request+0x2c1/0x830 [ 414.356863] scsi_io_completion+0x177/0x1610 [ 414.357379] scsi_complete+0x12f/0x260 [ 414.357856] blk_complete_reqs+0xba/0xf0 [ 414.358338] __do_softirq+0x1b0/0x7a2 [ 414.358796] irq_exit_rcu+0x14b/0x1a0 [ 414.359262] sysvec_call_function_single+0xaf/0xc0 [ 414.359828] asm_sysvec_call_function_single+0x1a/0x20 [ 414.360426] default_idle+0x1e/0x30 [ 414.360873] default_idle_call+0x9b/0x1f0 [ 414.361390] do_idle+0x2d2/0x3e0 [ 414.361819] cpu_startup_entry+0x55/0x60 [ 414.362314] start_secondary+0x235/0x2b0 [ 414.362809] secondary_startup_64_no_verify+0x18f/0x19b [ 414.363413] irq event stamp: 428794 [ 414.363825] hardirqs last enabled at (428793): [<ffffffff816bfd1c>] ktime_get+0x1dc/0x200 [ 414.364694] hardirqs last disabled at (428794): [<ffffffff85470177>] _raw_spin_lock_irq+0x47/0x50 [ 414.365629] softirqs last enabled at (428444): [<ffffffff85474780>] __do_softirq+0x540/0x7a2 [ 414.366522] softirqs last disabled at (428419): [<ffffffff813f65ab>] irq_exit_rcu+0x14b/0x1a0 [ 414.367425] other info that might help us debug this: [ 414.368194] Possible unsafe locking scenario: [ 414.368900] CPU0 [ 414.369225] ---- [ 414.369548] lock(&sbq->ws[i].wait); [ 414.370000] <Interrupt> [ 414.370342] lock(&sbq->ws[i].wait); [ 414.370802] *** DEADLOCK *** [ 414.371569] 5 locks held by kworker/u10:3/1152: [ 414.372088] #0: ffff88810130e938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_scheduled_works+0x357/0x13f0 [ 414.373180] #1: ffff88810201fdb8 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x3a3/0x13f0 [ 414.374384] #2: ffffffff86ffbdc0 (rcu_read_lock){....}-{1:2}, at: blk_mq_run_hw_queue+0x637/0xa00 [ 414.375342] #3: ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0 [ 414.376377] #4: ffff888106205a08 (&hctx->dispatch_wait_lock){+.-.}-{2:2}, at: blk_mq_dispatch_rq_list+0x1337/0x1ee0 [ 414.378607] stack backtrace: [ 414.379177] CPU: 0 PID: 1152 Comm: kworker/u10:3 Not tainted 6.6.0-07439-gba2303cacfda #6 [ 414.380032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 414.381177] Workqueue: writeback wb_workfn (flush-253:0) [ 414.381805] Call Trace: [ 414.382136] <TASK> [ 414.382429] dump_stack_lvl+0x91/0xf0 [ 414.382884] mark_lock_irq+0xb3b/0x1260 [ 414.383367] ? __pfx_mark_lock_irq+0x10/0x10 [ 414.383889] ? stack_trace_save+0x8e/0xc0 [ 414.384373] ? __pfx_stack_trace_save+0x10/0x10 [ 414.384903] ? graph_lock+0xcf/0x410 [ 414.385350] ? save_trace+0x3d/0xc70 [ 414.385808] mark_lock.part.20+0x56d/0xa90 [ 414.386317] mark_held_locks+0xb0/0x110 [ 414.386791] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 414.387320] lockdep_hardirqs_on_prepare+0x297/0x3f0 [ 414.387901] ? _raw_spin_unlock_irq+0x28/0x50 [ 414.388422] trace_hardirqs_on+0x58/0x100 [ 414.388917] _raw_spin_unlock_irq+0x28/0x50 [ 414.389422] __blk_mq_tag_busy+0x1d6/0x2a0 [ 414.389920] __blk_mq_get_driver_tag+0x761/0x9f0 [ 414.390899] blk_mq_dispatch_rq_list+0x1780/0x1ee0 [ 414.391473] ? __pfx_blk_mq_dispatch_rq_list+0x10/0x10 [ 414.392070] ? sbitmap_get+0x2b8/0x450 [ 414.392533] ? __blk_mq_get_driver_tag+0x210/0x9f0 [ 414.393095] __blk_mq_sched_dispatch_requests+0xd99/0x1690 [ 414.393730] ? elv_attempt_insert_merge+0x1b1/0x420 [ 414.394302] ? __pfx___blk_mq_sched_dispatch_requests+0x10/0x10 [ 414.394970] ? lock_acquire+0x18d/0x460 [ 414.395456] ? blk_mq_run_hw_queue+0x637/0xa00 [ 414.395986] ? __pfx_lock_acquire+0x10/0x10 [ 414.396499] blk_mq_sched_dispatch_requests+0x109/0x190 [ 414.397100] blk_mq_run_hw_queue+0x66e/0xa00 [ 414.397616] blk_mq_flush_plug_list.part.17+0x614/0x2030 [ 414.398244] ? __pfx_blk_mq_flush_plug_list.part.17+0x10/0x10 [ 414.398897] ? writeback_sb_inodes+0x241/0xcc0 [ 414.399429] blk_mq_flush_plug_list+0x65/0x80 [ 414.399957] __blk_flush_plug+0x2f1/0x530 [ 414.400458] ? __pfx___blk_flush_plug+0x10/0x10 [ 414.400999] blk_finish_plug+0x59/0xa0 [ 414.401467] wb_writeback+0x7cc/0x920 [ 414.401935] ? __pfx_wb_writeback+0x10/0x10 [ 414.402442] ? mark_held_locks+0xb0/0x110 [ 414.402931] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 414.403462] ? lockdep_hardirqs_on_prepare+0x297/0x3f0 [ 414.404062] wb_workfn+0x2b3/0xcf0 [ 414.404500] ? __pfx_wb_workfn+0x10/0x10 [ 414.404989] process_scheduled_works+0x432/0x13f0 [ 414.405546] ? __pfx_process_scheduled_works+0x10/0x10 [ 414.406139] ? do_raw_spin_lock+0x101/0x2a0 [ 414.406641] ? assign_work+0x19b/0x240 [ 414.407106] ? lock_is_held_type+0x9d/0x110 [ 414.407604] worker_thread+0x6f2/0x1160 [ 414.408075] ? __kthread_parkme+0x62/0x210 [ 414.408572] ? lockdep_hardirqs_on_prepare+0x297/0x3f0 [ 414.409168] ? __kthread_parkme+0x13c/0x210 [ 414.409678] ? __pfx_worker_thread+0x10/0x10 [ 414.410191] kthread+0x33c/0x440 [ 414.410602] ? __pfx_kthread+0x10/0x10 [ 414.411068] ret_from_fork+0x4d/0x80 [ 414.411526] ? __pfx_kthread+0x10/0x10 [ 414.411993] ret_from_fork_asm+0x1b/0x30 [ 414.412489] </TASK> When interrupt is turned on while a lock holding by spin_lock_irq it throws a warning because of potential deadlock. blk_mq_prep_dispatch_rq blk_mq_get_driver_tag __blk_mq_get_driver_tag __blk_mq_alloc_driver_tag blk_mq_tag_busy -> tag is already busy // failed to get driver tag blk_mq_mark_tag_wait spin_lock_irq(&wq->lock) -> lock A (&sbq->ws[i].wait) __add_wait_queue(wq, wait) -> wait queue active blk_mq_get_driver_tag __blk_mq_tag_busy -> 1) tag must be idle, which means there can't be inflight IO spin_lock_irq(&tags->lock) -> lock B (hctx->tags) spin_unlock_irq(&tags->lock) -> unlock B, turn on interrupt accidentally -> 2) context must be preempt by IO interrupt to trigger deadlock. As shown above, the deadlock is not possible in theory, but the warning still need to be fixed. Fix it by using spin_lock_irqsave to get lockB instead of spin_lock_irq. Fixes: 4f1731d ("blk-mq: fix potential io hang by wrong 'wake_batch'") Signed-off-by: Li Lingfeng <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
[ Upstream commit bd44ca3 ] Currently the dma debugging code can end up indirectly calling printk under the radix_lock. This happens when a radix tree node allocation fails. This is a problem because the printk code, when used together with netconsole, can end up inside the dma debugging code while trying to transmit a message over netcons. This creates the possibility of either a circular deadlock on the same CPU, with that CPU trying to grab the radix_lock twice, or an ABBA deadlock between different CPUs, where one CPU grabs the console lock first and then waits for the radix_lock, while the other CPU is holding the radix_lock and is waiting for the console lock. The trace captured by lockdep is of the ABBA variant. -> #2 (&dma_entry_hash[i].lock){-.-.}-{2:2}: _raw_spin_lock_irqsave+0x5a/0x90 debug_dma_map_page+0x79/0x180 dma_map_page_attrs+0x1d2/0x2f0 bnxt_start_xmit+0x8c6/0x1540 netpoll_start_xmit+0x13f/0x180 netpoll_send_skb+0x20d/0x320 netpoll_send_udp+0x453/0x4a0 write_ext_msg+0x1b9/0x460 console_flush_all+0x2ff/0x5a0 console_unlock+0x55/0x180 vprintk_emit+0x2e3/0x3c0 devkmsg_emit+0x5a/0x80 devkmsg_write+0xfd/0x180 do_iter_readv_writev+0x164/0x1b0 vfs_writev+0xf9/0x2b0 do_writev+0x6d/0x110 do_syscall_64+0x80/0x150 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (console_owner){-.-.}-{0:0}: __lock_acquire+0x15d1/0x31a0 lock_acquire+0xe8/0x290 console_flush_all+0x2ea/0x5a0 console_unlock+0x55/0x180 vprintk_emit+0x2e3/0x3c0 _printk+0x59/0x80 warn_alloc+0x122/0x1b0 __alloc_pages_slowpath+0x1101/0x1120 __alloc_pages+0x1eb/0x2c0 alloc_slab_page+0x5f/0x150 new_slab+0x2dc/0x4e0 ___slab_alloc+0xdcb/0x1390 kmem_cache_alloc+0x23d/0x360 radix_tree_node_alloc+0x3c/0xf0 radix_tree_insert+0xf5/0x230 add_dma_entry+0xe9/0x360 dma_map_page_attrs+0x1d2/0x2f0 __bnxt_alloc_rx_frag+0x147/0x180 bnxt_alloc_rx_data+0x79/0x160 bnxt_rx_skb+0x29/0xc0 bnxt_rx_pkt+0xe22/0x1570 __bnxt_poll_work+0x101/0x390 bnxt_poll+0x7e/0x320 __napi_poll+0x29/0x160 net_rx_action+0x1e0/0x3e0 handle_softirqs+0x190/0x510 run_ksoftirqd+0x4e/0x90 smpboot_thread_fn+0x1a8/0x270 kthread+0x102/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x11/0x20 This bug is more likely than it seems, because when one CPU has run out of memory, chances are the other has too. The good news is, this bug is hidden behind the CONFIG_DMA_API_DEBUG, so not many users are likely to trigger it. Signed-off-by: Rik van Riel <[email protected]> Reported-by: Konstantin Ovsepian <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
[ Upstream commit 56f4526 ] This timer HW supports 8, 16 and 32-bit timer widths. This driver currently uses a u32 to store the max possible value of the timer. However, statements perform addition of 2 in xilinx_pwm_apply() when calculating the period_cycles and duty_cycles values. Since priv->max is a u32, this will result in an overflow to 1 which will not only be incorrect but fail on range comparison. This results in making it impossible to set the PWM in this timer mode. There are two obvious solutions to the current problem: 1. Cast each instance where overflow occurs to u64. 2. Change priv->max from a u32 to a u64. Solution #1 requires more code modifications, and leaves opportunity to introduce similar overflows if other math statements are added in the future. These may also go undetected if running in non 32-bit timer modes. Solution #2 is the much smaller and cleaner approach and thus the chosen method in this patch. This was tested on a Zynq UltraScale+ with multiple instances of the PWM IP. Signed-off-by: Ken Sloat <[email protected]> Reviewed-by: Michal Simek <[email protected]> Reviewed-by: Sean Anderson <[email protected]> Link: https://lore.kernel.org/r/SJ0P222MB0107490C5371B848EF04351CA1E19@SJ0P222MB0107.NAMP222.PROD.OUTLOOK.COM Signed-off-by: Michal Simek <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
commit f9c169b upstream. If smb2_compound_op() is called with a valid @CFILE and returned -EINVAL, we need to call cifs_get_writable_path() before retrying it as the reference of @CFILE was already dropped by previous call. This fixes the following KASAN splat when running fstests generic/013 against Windows Server 2022: CIFS: Attempting to mount //w22-fs0/scratch run fstests generic/013 at 2024-09-02 19:48:59 ================================================================== BUG: KASAN: slab-use-after-free in detach_if_pending+0xab/0x200 Write of size 8 at addr ffff88811f1a3730 by task kworker/3:2/176 CPU: 3 UID: 0 PID: 176 Comm: kworker/3:2 Not tainted 6.11.0-rc6 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: cifsoplockd cifs_oplock_break [cifs] Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 ? detach_if_pending+0xab/0x200 print_report+0x156/0x4d9 ? detach_if_pending+0xab/0x200 ? __virt_addr_valid+0x145/0x300 ? __phys_addr+0x46/0x90 ? detach_if_pending+0xab/0x200 kasan_report+0xda/0x110 ? detach_if_pending+0xab/0x200 detach_if_pending+0xab/0x200 timer_delete+0x96/0xe0 ? __pfx_timer_delete+0x10/0x10 ? rcu_is_watching+0x20/0x50 try_to_grab_pending+0x46/0x3b0 __cancel_work+0x89/0x1b0 ? __pfx___cancel_work+0x10/0x10 ? kasan_save_track+0x14/0x30 cifs_close_deferred_file+0x110/0x2c0 [cifs] ? __pfx_cifs_close_deferred_file+0x10/0x10 [cifs] ? __pfx_down_read+0x10/0x10 cifs_oplock_break+0x4c1/0xa50 [cifs] ? __pfx_cifs_oplock_break+0x10/0x10 [cifs] ? lock_is_held_type+0x85/0xf0 ? mark_held_locks+0x1a/0x90 process_one_work+0x4c6/0x9f0 ? find_held_lock+0x8a/0xa0 ? __pfx_process_one_work+0x10/0x10 ? lock_acquired+0x220/0x550 ? __list_add_valid_or_report+0x37/0x100 worker_thread+0x2e4/0x570 ? __kthread_parkme+0xd1/0xf0 ? __pfx_worker_thread+0x10/0x10 kthread+0x17f/0x1c0 ? kthread+0xda/0x1c0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1118: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0xaa/0xb0 cifs_new_fileinfo+0xc8/0x9d0 [cifs] cifs_atomic_open+0x467/0x770 [cifs] lookup_open.isra.0+0x665/0x8b0 path_openat+0x4c3/0x1380 do_filp_open+0x167/0x270 do_sys_openat2+0x129/0x160 __x64_sys_creat+0xad/0xe0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 83: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 poison_slab_object+0xe9/0x160 __kasan_slab_free+0x32/0x50 kfree+0xf2/0x300 process_one_work+0x4c6/0x9f0 worker_thread+0x2e4/0x570 kthread+0x17f/0x1c0 ret_from_fork+0x31/0x60 ret_from_fork_asm+0x1a/0x30 Last potentially related work creation: kasan_save_stack+0x30/0x50 __kasan_record_aux_stack+0xad/0xc0 insert_work+0x29/0xe0 __queue_work+0x5ea/0x760 queue_work_on+0x6d/0x90 _cifsFileInfo_put+0x3f6/0x770 [cifs] smb2_compound_op+0x911/0x3940 [cifs] smb2_set_path_size+0x228/0x270 [cifs] cifs_set_file_size+0x197/0x460 [cifs] cifs_setattr+0xd9c/0x14b0 [cifs] notify_change+0x4e3/0x740 do_truncate+0xfa/0x180 vfs_truncate+0x195/0x200 __x64_sys_truncate+0x109/0x150 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 71f15c9 ("smb: client: retry compound request without reusing lease") Cc: [email protected] Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Cc: David Howells <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
commit d260327 upstream. Chi Zhiling reported: We found a null pointer accessing in tracefs[1], the reason is that the variable 'ei_child' is set to LIST_POISON1, that means the list was removed in eventfs_remove_rec. so when access the ei_child->is_freed, the panic triggered. by the way, the following script can reproduce this panic loop1 (){ while true do echo "p:kp submit_bio" > /sys/kernel/debug/tracing/kprobe_events echo "" > /sys/kernel/debug/tracing/kprobe_events done } loop2 (){ while true do tree /sys/kernel/debug/tracing/events/kprobes/ done } loop1 & loop2 [1]: [ 1147.959632][T17331] Unable to handle kernel paging request at virtual address dead000000000150 [ 1147.968239][T17331] Mem abort info: [ 1147.971739][T17331] ESR = 0x0000000096000004 [ 1147.976172][T17331] EC = 0x25: DABT (current EL), IL = 32 bits [ 1147.982171][T17331] SET = 0, FnV = 0 [ 1147.985906][T17331] EA = 0, S1PTW = 0 [ 1147.989734][T17331] FSC = 0x04: level 0 translation fault [ 1147.995292][T17331] Data abort info: [ 1147.998858][T17331] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 1148.005023][T17331] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 1148.010759][T17331] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 1148.016752][T17331] [dead000000000150] address between user and kernel address ranges [ 1148.024571][T17331] Internal error: Oops: 0000000096000004 [#1] SMP [ 1148.030825][T17331] Modules linked in: team_mode_loadbalance team nlmon act_gact cls_flower sch_ingress bonding tls macvlan dummy ib_core bridge stp llc veth amdgpu amdxcp mfd_core gpu_sched drm_exec drm_buddy radeon crct10dif_ce video drm_suballoc_helper ghash_ce drm_ttm_helper sha2_ce ttm sha256_arm64 i2c_algo_bit sha1_ce sbsa_gwdt cp210x drm_display_helper cec sr_mod cdrom drm_kms_helper binfmt_misc sg loop fuse drm dm_mod nfnetlink ip_tables autofs4 [last unloaded: tls] [ 1148.072808][T17331] CPU: 3 PID: 17331 Comm: ls Tainted: G W ------- ---- 6.6.43 #2 [ 1148.081751][T17331] Source Version: 21b3b386e948bedd29369af66f3e98ab01b1c650 [ 1148.088783][T17331] Hardware name: Greatwall GW-001M1A-FTF/GW-001M1A-FTF, BIOS KunLun BIOS V4.0 07/16/2020 [ 1148.098419][T17331] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1148.106060][T17331] pc : eventfs_iterate+0x2c0/0x398 [ 1148.111017][T17331] lr : eventfs_iterate+0x2fc/0x398 [ 1148.115969][T17331] sp : ffff80008d56bbd0 [ 1148.119964][T17331] x29: ffff80008d56bbf0 x28: ffff001ff5be2600 x27: 0000000000000000 [ 1148.127781][T17331] x26: ffff001ff52ca4e0 x25: 0000000000009977 x24: dead000000000100 [ 1148.135598][T17331] x23: 0000000000000000 x22: 000000000000000b x21: ffff800082645f10 [ 1148.143415][T17331] x20: ffff001fddf87c70 x19: ffff80008d56bc90 x18: 0000000000000000 [ 1148.151231][T17331] x17: 0000000000000000 x16: 0000000000000000 x15: ffff001ff52ca4e0 [ 1148.159048][T17331] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 1148.166864][T17331] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000804391d0 [ 1148.174680][T17331] x8 : 0000000180000000 x7 : 0000000000000018 x6 : 0000aaab04b92862 [ 1148.182498][T17331] x5 : 0000aaab04b92862 x4 : 0000000080000000 x3 : 0000000000000068 [ 1148.190314][T17331] x2 : 000000000000000f x1 : 0000000000007ea8 x0 : 0000000000000001 [ 1148.198131][T17331] Call trace: [ 1148.201259][T17331] eventfs_iterate+0x2c0/0x398 [ 1148.205864][T17331] iterate_dir+0x98/0x188 [ 1148.210036][T17331] __arm64_sys_getdents64+0x78/0x160 [ 1148.215161][T17331] invoke_syscall+0x78/0x108 [ 1148.219593][T17331] el0_svc_common.constprop.0+0x48/0xf0 [ 1148.224977][T17331] do_el0_svc+0x24/0x38 [ 1148.228974][T17331] el0_svc+0x40/0x168 [ 1148.232798][T17331] el0t_64_sync_handler+0x120/0x130 [ 1148.237836][T17331] el0t_64_sync+0x1a4/0x1a8 [ 1148.242182][T17331] Code: 54ffff6c f9400676 910006d6 f900067 (b9405300) [ 1148.248955][T17331] ---[ end trace 0000000000000000 ]--- The issue is that list_del() is used on an SRCU protected list variable before the synchronization occurs. This can poison the list pointers while there is a reader iterating the list. This is simply fixed by using list_del_rcu() that is specifically made for this purpose. Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 43aa6f9 ("eventfs: Get rid of dentry pointers without refcounts") Reported-by: Chi Zhiling <[email protected]> Tested-by: Chi Zhiling <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
[ Upstream commit c68bbf5 ] This adds a check before freeing the rx->skb in flush and close functions to handle the kernel crash seen while removing driver after FW download fails or before FW download completes. dmesg log: [ 54.634586] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080 [ 54.643398] Mem abort info: [ 54.646204] ESR = 0x0000000096000004 [ 54.649964] EC = 0x25: DABT (current EL), IL = 32 bits [ 54.655286] SET = 0, FnV = 0 [ 54.658348] EA = 0, S1PTW = 0 [ 54.661498] FSC = 0x04: level 0 translation fault [ 54.666391] Data abort info: [ 54.669273] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 54.674768] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 54.674771] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 54.674775] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000048860000 [ 54.674780] [0000000000000080] pgd=0000000000000000, p4d=0000000000000000 [ 54.703880] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 54.710152] Modules linked in: btnxpuart(-) overlay fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine authenc libdes crct10dif_ce polyval_ce polyval_generic snd_soc_imx_spdif snd_soc_imx_card snd_soc_ak5558 snd_soc_ak4458 caam secvio error snd_soc_fsl_micfil snd_soc_fsl_spdif snd_soc_fsl_sai snd_soc_fsl_utils imx_pcm_dma gpio_ir_recv rc_core sch_fq_codel fuse [ 54.744357] CPU: 3 PID: 72 Comm: kworker/u9:0 Not tainted 6.6.3-otbr-g128004619037 #2 [ 54.744364] Hardware name: FSL i.MX8MM EVK board (DT) [ 54.744368] Workqueue: hci0 hci_power_on [ 54.757244] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 54.757249] pc : kfree_skb_reason+0x18/0xb0 [ 54.772299] lr : btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.782921] sp : ffff8000805ebca0 [ 54.782923] x29: ffff8000805ebca0 x28: ffffa5c6cf1869c0 x27: ffffa5c6cf186000 [ 54.782931] x26: ffff377b84852400 x25: ffff377b848523c0 x24: ffff377b845e7230 [ 54.782938] x23: ffffa5c6ce8dbe08 x22: ffffa5c6ceb65410 x21: 00000000ffffff92 [ 54.782945] x20: ffffa5c6ce8dbe98 x19: ffffffffffffffac x18: ffffffffffffffff [ 54.807651] x17: 0000000000000000 x16: ffffa5c6ce2824ec x15: ffff8001005eb857 [ 54.821917] x14: 0000000000000000 x13: ffffa5c6cf1a02e0 x12: 0000000000000642 [ 54.821924] x11: 0000000000000040 x10: ffffa5c6cf19d690 x9 : ffffa5c6cf19d688 [ 54.821931] x8 : ffff377b86000028 x7 : 0000000000000000 x6 : 0000000000000000 [ 54.821938] x5 : ffff377b86000000 x4 : 0000000000000000 x3 : 0000000000000000 [ 54.843331] x2 : 0000000000000000 x1 : 0000000000000002 x0 : ffffffffffffffac [ 54.857599] Call trace: [ 54.857601] kfree_skb_reason+0x18/0xb0 [ 54.863878] btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.863888] hci_dev_open_sync+0x3a8/0xa04 [ 54.872773] hci_power_on+0x54/0x2e4 [ 54.881832] process_one_work+0x138/0x260 [ 54.881842] worker_thread+0x32c/0x438 [ 54.881847] kthread+0x118/0x11c [ 54.881853] ret_from_fork+0x10/0x20 [ 54.896406] Code: a9be7bfd 910003fd f9000bf3 aa0003f3 (b940d400) [ 54.896410] ---[ end trace 0000000000000000 ]--- Signed-off-by: Neeraj Sanjay Kale <[email protected]> Tested-by: Guillaume Legoupil <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Sep 27, 2024
[ Upstream commit c3a5e3e ] When using cachefiles, lockdep may emit something similar to the circular locking dependency notice below. The problem appears to stem from the following: (1) Cachefiles manipulates xattrs on the files in its cache when called from ->writepages(). (2) The setxattr() and removexattr() system call handlers get the name (and value) from userspace after taking the sb_writers lock, putting accesses of the vma->vm_lock and mm->mmap_lock inside of that. (3) The afs filesystem uses a per-inode lock to prevent multiple revalidation RPCs and in writeback vs truncate to prevent parallel operations from deadlocking against the server on one side and local page locks on the other. Fix this by moving the getting of the name and value in {get,remove}xattr() outside of the sb_writers lock. This also has the minor benefits that we don't need to reget these in the event of a retry and we never try to take the sb_writers lock in the event we can't pull the name and value into the kernel. Alternative approaches that might fix this include moving the dispatch of a write to the cache off to a workqueue or trying to do without the validation lock in afs. Note that this might also affect other filesystems that use netfslib and/or cachefiles. ====================================================== WARNING: possible circular locking dependency detected 6.10.0-build2+ torvalds#956 Not tainted ------------------------------------------------------ fsstress/6050 is trying to acquire lock: ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0 but task is already holding lock: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&vma->vm_lock->lock){++++}-{3:3}: __lock_acquire+0xaf0/0xd80 lock_acquire.part.0+0x103/0x280 down_write+0x3b/0x50 vma_start_write+0x6b/0xa0 vma_link+0xcc/0x140 insert_vm_struct+0xb7/0xf0 alloc_bprm+0x2c1/0x390 kernel_execve+0x65/0x1a0 call_usermodehelper_exec_async+0x14d/0x190 ret_from_fork+0x24/0x40 ret_from_fork_asm+0x1a/0x30 -> #3 (&mm->mmap_lock){++++}-{3:3}: __lock_acquire+0xaf0/0xd80 lock_acquire.part.0+0x103/0x280 __might_fault+0x7c/0xb0 strncpy_from_user+0x25/0x160 removexattr+0x7f/0x100 __do_sys_fremovexattr+0x7e/0xb0 do_syscall_64+0x9f/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #2 (sb_writers#14){.+.+}-{0:0}: __lock_acquire+0xaf0/0xd80 lock_acquire.part.0+0x103/0x280 percpu_down_read+0x3c/0x90 vfs_iocb_iter_write+0xe9/0x1d0 __cachefiles_write+0x367/0x430 cachefiles_issue_write+0x299/0x2f0 netfs_advance_write+0x117/0x140 netfs_write_folio.isra.0+0x5ca/0x6e0 netfs_writepages+0x230/0x2f0 afs_writepages+0x4d/0x70 do_writepages+0x1e8/0x3e0 filemap_fdatawrite_wbc+0x84/0xa0 __filemap_fdatawrite_range+0xa8/0xf0 file_write_and_wait_range+0x59/0x90 afs_release+0x10f/0x270 __fput+0x25f/0x3d0 __do_sys_close+0x43/0x70 do_syscall_64+0x9f/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&vnode->validate_lock){++++}-{3:3}: __lock_acquire+0xaf0/0xd80 lock_acquire.part.0+0x103/0x280 down_read+0x95/0x200 afs_writepages+0x37/0x70 do_writepages+0x1e8/0x3e0 filemap_fdatawrite_wbc+0x84/0xa0 filemap_invalidate_inode+0x167/0x1e0 netfs_unbuffered_write_iter+0x1bd/0x2d0 vfs_write+0x22e/0x320 ksys_write+0xbc/0x130 do_syscall_64+0x9f/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #0 (mapping.invalidate_lock#3){++++}-{3:3}: check_noncircular+0x119/0x160 check_prev_add+0x195/0x430 __lock_acquire+0xaf0/0xd80 lock_acquire.part.0+0x103/0x280 down_read+0x95/0x200 filemap_fault+0x26e/0x8b0 __do_fault+0x57/0xd0 do_pte_missing+0x23b/0x320 __handle_mm_fault+0x2d4/0x320 handle_mm_fault+0x14f/0x260 do_user_addr_fault+0x2a2/0x500 exc_page_fault+0x71/0x90 asm_exc_page_fault+0x22/0x30 other info that might help us debug this: Chain exists of: mapping.invalidate_lock#3 --> &mm->mmap_lock --> &vma->vm_lock->lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(&vma->vm_lock->lock); lock(&mm->mmap_lock); lock(&vma->vm_lock->lock); rlock(mapping.invalidate_lock#3); *** DEADLOCK *** 1 lock held by fsstress/6050: #0: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250 stack backtrace: CPU: 0 PID: 6050 Comm: fsstress Not tainted 6.10.0-build2+ torvalds#956 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x80 check_noncircular+0x119/0x160 ? queued_spin_lock_slowpath+0x4be/0x510 ? __pfx_check_noncircular+0x10/0x10 ? __pfx_queued_spin_lock_slowpath+0x10/0x10 ? mark_lock+0x47/0x160 ? init_chain_block+0x9c/0xc0 ? add_chain_block+0x84/0xf0 check_prev_add+0x195/0x430 __lock_acquire+0xaf0/0xd80 ? __pfx___lock_acquire+0x10/0x10 ? __lock_release.isra.0+0x13b/0x230 lock_acquire.part.0+0x103/0x280 ? filemap_fault+0x26e/0x8b0 ? __pfx_lock_acquire.part.0+0x10/0x10 ? rcu_is_watching+0x34/0x60 ? lock_acquire+0xd7/0x120 down_read+0x95/0x200 ? filemap_fault+0x26e/0x8b0 ? __pfx_down_read+0x10/0x10 ? __filemap_get_folio+0x25/0x1a0 filemap_fault+0x26e/0x8b0 ? __pfx_filemap_fault+0x10/0x10 ? find_held_lock+0x7c/0x90 ? __pfx___lock_release.isra.0+0x10/0x10 ? __pte_offset_map+0x99/0x110 __do_fault+0x57/0xd0 do_pte_missing+0x23b/0x320 __handle_mm_fault+0x2d4/0x320 ? __pfx___handle_mm_fault+0x10/0x10 handle_mm_fault+0x14f/0x260 do_user_addr_fault+0x2a2/0x500 exc_page_fault+0x71/0x90 asm_exc_page_fault+0x22/0x30 Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/r/[email protected] cc: Alexander Viro <[email protected]> cc: Christian Brauner <[email protected]> cc: Jan Kara <[email protected]> cc: Jeff Layton <[email protected]> cc: Gao Xiang <[email protected]> cc: Matthew Wilcox <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] [brauner: fix minor issues] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 88fd2b7 ] Commit bab1c29 ("LoongArch: Fix sleeping in atomic context in setup_tlb_handler()") changes the gfp flag from GFP_KERNEL to GFP_ATOMIC for alloc_pages_node(). However, for PREEMPT_RT kernels we can still get a "sleeping in atomic context" error: [ 0.372259] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 0.372266] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 [ 0.372268] preempt_count: 1, expected: 0 [ 0.372270] RCU nest depth: 1, expected: 1 [ 0.372272] 3 locks held by swapper/1/0: [ 0.372274] #0: 900000000c9f5e60 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x524/0x1c60 [ 0.372294] #1: 90000000087013b8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x50/0x140 [ 0.372305] #2: 900000047fffd388 (&zone->lock){+.+.}-{3:3}, at: __rmqueue_pcplist+0x30c/0xea0 [ 0.372314] irq event stamp: 0 [ 0.372316] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 0.372322] hardirqs last disabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0 [ 0.372329] softirqs last enabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0 [ 0.372335] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 0.372341] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7+ #1891 [ 0.372346] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 [ 0.372349] Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 9000000100388000 [ 0.372486] 900000010038b890 0000000000000000 900000010038b898 9000000007e53788 [ 0.372492] 900000000815bcc8 900000000815bcc0 900000010038b700 0000000000000001 [ 0.372498] 0000000000000001 4b031894b9d6b725 00000000055ec000 9000000100338fc0 [ 0.372503] 00000000000000c4 0000000000000001 000000000000002d 0000000000000003 [ 0.372509] 0000000000000030 0000000000000003 00000000055ec000 0000000000000003 [ 0.372515] 900000000806d000 9000000007e53788 00000000000000b0 0000000000000004 [ 0.372521] 0000000000000000 0000000000000000 900000000c9f5f10 0000000000000000 [ 0.372526] 90000000076f12d8 9000000007e53788 9000000005924778 0000000000000000 [ 0.372532] 00000000000000b0 0000000000000004 0000000000000000 0000000000070000 [ 0.372537] ... [ 0.372540] Call Trace: [ 0.372542] [<9000000005924778>] show_stack+0x38/0x180 [ 0.372548] [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [ 0.372555] [<900000000599b880>] __might_resched+0x1a0/0x260 [ 0.372561] [<90000000071675cc>] rt_spin_lock+0x4c/0x140 [ 0.372565] [<9000000005cbb768>] __rmqueue_pcplist+0x308/0xea0 [ 0.372570] [<9000000005cbed84>] get_page_from_freelist+0x564/0x1c60 [ 0.372575] [<9000000005cc0d98>] __alloc_pages_noprof+0x218/0x1820 [ 0.372580] [<900000000593b36c>] tlb_init+0x1ac/0x298 [ 0.372585] [<9000000005924b74>] per_cpu_trap_init+0x114/0x140 [ 0.372589] [<9000000005921964>] cpu_probe+0x4e4/0xa60 [ 0.372592] [<9000000005934874>] start_secondary+0x34/0xc0 [ 0.372599] [<900000000715615c>] smpboot_entry+0x64/0x6c This is because in PREEMPT_RT kernels normal spinlocks are replaced by rt spinlocks and rt_spin_lock() will cause sleeping. Fix it by disabling NUMA optimization completely for PREEMPT_RT kernels. Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 2e3dbf9 ] Since the netlink attribute range validation provides inclusive checking, the *max* of attribute NL80211_ATTR_MLO_LINK_ID should be IEEE80211_MLD_MAX_NUM_LINKS - 1 otherwise causing an off-by-one. One crash stack for demonstration: ================================================================== BUG: KASAN: wild-memory-access in ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939 Read of size 6 at addr 001102080000000c by task fuzzer.386/9508 CPU: 1 PID: 9508 Comm: syz.1.386 Not tainted 6.1.70 #2 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x177/0x231 lib/dump_stack.c:106 print_report+0xe0/0x750 mm/kasan/report.c:398 kasan_report+0x139/0x170 mm/kasan/report.c:495 kasan_check_range+0x287/0x290 mm/kasan/generic.c:189 memcpy+0x25/0x60 mm/kasan/shadow.c:65 ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939 rdev_tx_control_port net/wireless/rdev-ops.h:761 [inline] nl80211_tx_control_port+0x7b3/0xc40 net/wireless/nl80211.c:15453 genl_family_rcv_msg_doit+0x22e/0x320 net/netlink/genetlink.c:756 genl_family_rcv_msg net/netlink/genetlink.c:833 [inline] genl_rcv_msg+0x539/0x740 net/netlink/genetlink.c:850 netlink_rcv_skb+0x1de/0x420 net/netlink/af_netlink.c:2508 genl_rcv+0x24/0x40 net/netlink/genetlink.c:861 netlink_unicast_kernel net/netlink/af_netlink.c:1326 [inline] netlink_unicast+0x74b/0x8c0 net/netlink/af_netlink.c:1352 netlink_sendmsg+0x882/0xb90 net/netlink/af_netlink.c:1874 sock_sendmsg_nosec net/socket.c:716 [inline] __sock_sendmsg net/socket.c:728 [inline] ____sys_sendmsg+0x5cc/0x8f0 net/socket.c:2499 ___sys_sendmsg+0x21c/0x290 net/socket.c:2553 __sys_sendmsg net/socket.c:2582 [inline] __do_sys_sendmsg net/socket.c:2591 [inline] __se_sys_sendmsg+0x19e/0x270 net/socket.c:2589 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x45/0x90 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x63/0xcd Update the policy to ensure correct validation. Fixes: 7b0a0e3 ("wifi: cfg80211: do some rework towards MLO link APIs") Signed-off-by: Lin Ma <[email protected]> Suggested-by: Cengiz Can <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 05aa156 ] The mapping VMA address is saved in VAS window struct when the paste address is mapped. This VMA address is used during migration to unmap the paste address if the window is active. The paste address mapping will be removed when the window is closed or with the munmap(). But the VMA address in the VAS window is not updated with munmap() which is causing invalid access during migration. The KASAN report shows: [16386.254991] BUG: KASAN: slab-use-after-free in reconfig_close_windows+0x1a0/0x4e8 [16386.255043] Read of size 8 at addr c00000014a819670 by task drmgr/696928 [16386.255096] CPU: 29 UID: 0 PID: 696928 Comm: drmgr Kdump: loaded Tainted: G B 6.11.0-rc5-nxgzip #2 [16386.255128] Tainted: [B]=BAD_PAGE [16386.255148] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.00 (NH1110_016) hv:phyp pSeries [16386.255181] Call Trace: [16386.255202] [c00000016b297660] [c0000000018ad0ac] dump_stack_lvl+0x84/0xe8 (unreliable) [16386.255246] [c00000016b297690] [c0000000006e8a90] print_report+0x19c/0x764 [16386.255285] [c00000016b297760] [c0000000006e9490] kasan_report+0x128/0x1f8 [16386.255309] [c00000016b297880] [c0000000006eb5c8] __asan_load8+0xac/0xe0 [16386.255326] [c00000016b2978a0] [c00000000013f898] reconfig_close_windows+0x1a0/0x4e8 [16386.255343] [c00000016b297990] [c000000000140e58] vas_migration_handler+0x3a4/0x3fc [16386.255368] [c00000016b297a90] [c000000000128848] pseries_migrate_partition+0x4c/0x4c4 ... [16386.256136] Allocated by task 696554 on cpu 31 at 16377.277618s: [16386.256149] kasan_save_stack+0x34/0x68 [16386.256163] kasan_save_track+0x34/0x80 [16386.256175] kasan_save_alloc_info+0x58/0x74 [16386.256196] __kasan_slab_alloc+0xb8/0xdc [16386.256209] kmem_cache_alloc_noprof+0x200/0x3d0 [16386.256225] vm_area_alloc+0x44/0x150 [16386.256245] mmap_region+0x214/0x10c4 [16386.256265] do_mmap+0x5fc/0x750 [16386.256277] vm_mmap_pgoff+0x14c/0x24c [16386.256292] ksys_mmap_pgoff+0x20c/0x348 [16386.256303] sys_mmap+0xd0/0x160 ... [16386.256350] Freed by task 0 on cpu 31 at 16386.204848s: [16386.256363] kasan_save_stack+0x34/0x68 [16386.256374] kasan_save_track+0x34/0x80 [16386.256384] kasan_save_free_info+0x64/0x10c [16386.256396] __kasan_slab_free+0x120/0x204 [16386.256415] kmem_cache_free+0x128/0x450 [16386.256428] vm_area_free_rcu_cb+0xa8/0xd8 [16386.256441] rcu_do_batch+0x2c8/0xcf0 [16386.256458] rcu_core+0x378/0x3c4 [16386.256473] handle_softirqs+0x20c/0x60c [16386.256495] do_softirq_own_stack+0x6c/0x88 [16386.256509] do_softirq_own_stack+0x58/0x88 [16386.256521] __irq_exit_rcu+0x1a4/0x20c [16386.256533] irq_exit+0x20/0x38 [16386.256544] interrupt_async_exit_prepare.constprop.0+0x18/0x2c ... [16386.256717] Last potentially related work creation: [16386.256729] kasan_save_stack+0x34/0x68 [16386.256741] __kasan_record_aux_stack+0xcc/0x12c [16386.256753] __call_rcu_common.constprop.0+0x94/0xd04 [16386.256766] vm_area_free+0x28/0x3c [16386.256778] remove_vma+0xf4/0x114 [16386.256797] do_vmi_align_munmap.constprop.0+0x684/0x870 [16386.256811] __vm_munmap+0xe0/0x1f8 [16386.256821] sys_munmap+0x54/0x6c [16386.256830] system_call_exception+0x1a0/0x4a0 [16386.256841] system_call_vectored_common+0x15c/0x2ec [16386.256868] The buggy address belongs to the object at c00000014a819670 which belongs to the cache vm_area_struct of size 168 [16386.256887] The buggy address is located 0 bytes inside of freed 168-byte region [c00000014a819670, c00000014a819718) [16386.256915] The buggy address belongs to the physical page: [16386.256928] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a81 [16386.256950] memcg:c0000000ba430001 [16386.256961] anon flags: 0x43ffff800000000(node=4|zone=0|lastcpupid=0x7ffff) [16386.256975] page_type: 0xfdffffff(slab) [16386.256990] raw: 043ffff800000000 c00000000501c080 0000000000000000 5deadbee00000001 [16386.257003] raw: 0000000000000000 00000000011a011a 00000001fdffffff c0000000ba430001 [16386.257018] page dumped because: kasan: bad access detected This patch adds close() callback in vas_vm_ops vm_operations_struct which will be executed during munmap() before freeing VMA. The VMA address in the VAS window is set to NULL after holding the window mmap_mutex. Fixes: 37e6764 ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 8fac326 ] When I try to manually set bitrates: iw wlan0 set bitrates legacy-2.4 1 I get sleeping from invalid context error, see below. Fix that by switching to use recently introduced ieee80211_iterate_stations_mtx(). Do note that WCN6855 firmware is still crashing, I'm not sure if that firmware even supports bitrate WMI commands and should we consider disabling ath12k_mac_op_set_bitrate_mask() for WCN6855? But that's for another patch. BUG: sleeping function called from invalid context at drivers/net/wireless/ath/ath12k/wmi.c:420 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 2236, name: iw preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 3 locks held by iw/2236: #0: ffffffffabc6f1d8 (cb_lock){++++}-{3:3}, at: genl_rcv+0x14/0x40 #1: ffff888138410810 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_pre_doit+0x54d/0x800 [cfg80211] #2: ffffffffab2cfaa0 (rcu_read_lock){....}-{1:2}, at: ieee80211_iterate_stations_atomic+0x2f/0x200 [mac80211] CPU: 3 UID: 0 PID: 2236 Comm: iw Not tainted 6.11.0-rc7-wt-ath+ #1772 Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 Call Trace: <TASK> dump_stack_lvl+0xa4/0xe0 dump_stack+0x10/0x20 __might_resched+0x363/0x5a0 ? __alloc_skb+0x165/0x340 __might_sleep+0xad/0x160 ath12k_wmi_cmd_send+0xb1/0x3d0 [ath12k] ? ath12k_wmi_init_wcn7850+0xa40/0xa40 [ath12k] ? __netdev_alloc_skb+0x45/0x7b0 ? __asan_memset+0x39/0x40 ? ath12k_wmi_alloc_skb+0xf0/0x150 [ath12k] ? reacquire_held_locks+0x4d0/0x4d0 ath12k_wmi_set_peer_param+0x340/0x5b0 [ath12k] ath12k_mac_disable_peer_fixed_rate+0xa3/0x110 [ath12k] ? ath12k_mac_vdev_stop+0x4f0/0x4f0 [ath12k] ieee80211_iterate_stations_atomic+0xd4/0x200 [mac80211] ath12k_mac_op_set_bitrate_mask+0x5d2/0x1080 [ath12k] ? ath12k_mac_vif_chan+0x320/0x320 [ath12k] drv_set_bitrate_mask+0x267/0x470 [mac80211] ieee80211_set_bitrate_mask+0x4cc/0x8a0 [mac80211] ? __this_cpu_preempt_check+0x13/0x20 nl80211_set_tx_bitrate_mask+0x2bc/0x530 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? trace_contention_end+0xef/0x140 ? rtnl_unlock+0x9/0x10 ? nl80211_pre_doit+0x557/0x800 [cfg80211] genl_family_rcv_msg_doit+0x1f0/0x2e0 ? genl_family_rcv_msg_attrs_parse.isra.0+0x250/0x250 ? ns_capable+0x57/0xd0 genl_family_rcv_msg+0x34c/0x600 ? genl_family_rcv_msg_dumpit+0x310/0x310 ? __lock_acquire+0xc62/0x1de0 ? he_set_mcs_mask.isra.0+0x8d0/0x8d0 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? cfg80211_external_auth_request+0x690/0x690 [cfg80211] genl_rcv_msg+0xa0/0x130 netlink_rcv_skb+0x14c/0x400 ? genl_family_rcv_msg+0x600/0x600 ? netlink_ack+0xd70/0xd70 ? rwsem_optimistic_spin+0x4f0/0x4f0 ? genl_rcv+0x14/0x40 ? down_read_killable+0x580/0x580 ? netlink_deliver_tap+0x13e/0x350 ? __this_cpu_preempt_check+0x13/0x20 genl_rcv+0x23/0x40 netlink_unicast+0x45e/0x790 ? netlink_attachskb+0x7f0/0x7f0 netlink_sendmsg+0x7eb/0xdb0 ? netlink_unicast+0x790/0x790 ? __this_cpu_preempt_check+0x13/0x20 ? selinux_socket_sendmsg+0x31/0x40 ? netlink_unicast+0x790/0x790 __sock_sendmsg+0xc9/0x160 ____sys_sendmsg+0x620/0x990 ? kernel_sendmsg+0x30/0x30 ? __copy_msghdr+0x410/0x410 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xe6/0x1470 ___sys_sendmsg+0xe9/0x170 ? copy_msghdr_from_user+0x120/0x120 ? __lock_acquire+0xc62/0x1de0 ? do_fault_around+0x2c6/0x4e0 ? do_user_addr_fault+0x8c1/0xde0 ? reacquire_held_locks+0x220/0x4d0 ? do_user_addr_fault+0x8c1/0xde0 ? __kasan_check_read+0x11/0x20 ? __fdget+0x4e/0x1d0 ? sockfd_lookup_light+0x1a/0x170 __sys_sendmsg+0xd2/0x180 ? __sys_sendmsg_sock+0x20/0x20 ? reacquire_held_locks+0x4d0/0x4d0 ? debug_smp_processor_id+0x17/0x20 __x64_sys_sendmsg+0x72/0xb0 ? lockdep_hardirqs_on+0x7d/0x100 x64_sys_call+0x894/0x9f0 do_syscall_64+0x64/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f230fe04807 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffe996a7ea8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000556f9f9c3390 RCX: 00007f230fe04807 RDX: 0000000000000000 RSI: 00007ffe996a7ee0 RDI: 0000000000000003 RBP: 0000556f9f9c88c0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000556f965ca190 R11: 0000000000000246 R12: 0000556f9f9c8780 R13: 00007ffe996a7ee0 R14: 0000556f9f9c87d0 R15: 0000556f9f9c88c0 </TASK> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Signed-off-by: Kalle Valo <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jeff Johnson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 542ed81 ] Access to genmask field in struct nft_set_ext results in unaligned atomic read: [ 72.130109] Unable to handle kernel paging request at virtual address ffff0000c2bb708c [ 72.131036] Mem abort info: [ 72.131213] ESR = 0x0000000096000021 [ 72.131446] EC = 0x25: DABT (current EL), IL = 32 bits [ 72.132209] SET = 0, FnV = 0 [ 72.133216] EA = 0, S1PTW = 0 [ 72.134080] FSC = 0x21: alignment fault [ 72.135593] Data abort info: [ 72.137194] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000 [ 72.142351] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 72.145989] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 72.150115] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000237d27000 [ 72.154893] [ffff0000c2bb708c] pgd=0000000000000000, p4d=180000023ffff403, pud=180000023f84b403, pmd=180000023f835403, +pte=0068000102bb7707 [ 72.163021] Internal error: Oops: 0000000096000021 [#1] SMP [...] [ 72.170041] CPU: 7 UID: 0 PID: 54 Comm: kworker/7:0 Tainted: G E 6.13.0-rc3+ #2 [ 72.170509] Tainted: [E]=UNSIGNED_MODULE [ 72.170720] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202302-for-qemu 03/01/2023 [ 72.171192] Workqueue: events_power_efficient nft_rhash_gc [nf_tables] [ 72.171552] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 72.171915] pc : nft_rhash_gc+0x200/0x2d8 [nf_tables] [ 72.172166] lr : nft_rhash_gc+0x128/0x2d8 [nf_tables] [ 72.172546] sp : ffff800081f2bce0 [ 72.172724] x29: ffff800081f2bd40 x28: ffff0000c2bb708c x27: 0000000000000038 [ 72.173078] x26: ffff0000c6780ef0 x25: ffff0000c643df00 x24: ffff0000c6778f78 [ 72.173431] x23: 000000000000001a x22: ffff0000c4b1f000 x21: ffff0000c6780f78 [ 72.173782] x20: ffff0000c2bb70dc x19: ffff0000c2bb7080 x18: 0000000000000000 [ 72.174135] x17: ffff0000c0a4e1c0 x16: 0000000000003000 x15: 0000ac26d173b978 [ 72.174485] x14: ffffffffffffffff x13: 0000000000000030 x12: ffff0000c6780ef0 [ 72.174841] x11: 0000000000000000 x10: ffff800081f2bcf8 x9 : ffff0000c3000000 [ 72.175193] x8 : 00000000000004be x7 : 0000000000000000 x6 : 0000000000000000 [ 72.175544] x5 : 0000000000000040 x4 : ffff0000c3000010 x3 : 0000000000000000 [ 72.175871] x2 : 0000000000003a98 x1 : ffff0000c2bb708c x0 : 0000000000000004 [ 72.176207] Call trace: [ 72.176316] nft_rhash_gc+0x200/0x2d8 [nf_tables] (P) [ 72.176653] process_one_work+0x178/0x3d0 [ 72.176831] worker_thread+0x200/0x3f0 [ 72.176995] kthread+0xe8/0xf8 [ 72.177130] ret_from_fork+0x10/0x20 [ 72.177289] Code: 54fff984 d503201f d2800080 91003261 (f820303f) [ 72.177557] ---[ end trace 0000000000000000 ]--- Align struct nft_set_ext to word size to address this and documentation it. pahole reports that this increases the size of elements for rhash and pipapo in 8 bytes on x86_64. Fixes: 7ffc748 ("netfilter: nft_set_hash: skip duplicated elements pending gc run") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 4d94f05 ] This reworks hci_cb_list to not use mutex hci_cb_list_lock to avoid bugs like the bellow: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5070, name: kworker/u9:2 preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 4 locks held by kworker/u9:2/5070: #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x1770 kernel/workqueue.c:3335 #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x1770 kernel/workqueue.c:3335 #2: ffff8880665d0078 (&hdev->lock){+.+.}-{3:3}, at: hci_le_create_big_complete_evt+0xcf/0xae0 net/bluetooth/hci_event.c:6914 #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: hci_le_create_big_complete_evt+0xdb/0xae0 net/bluetooth/hci_event.c:6915 CPU: 0 PID: 5070 Comm: kworker/u9:2 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: hci0 hci_rx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 __might_resched+0x5d4/0x780 kernel/sched/core.c:10187 __mutex_lock_common kernel/locking/mutex.c:585 [inline] __mutex_lock+0xc1/0xd70 kernel/locking/mutex.c:752 hci_connect_cfm include/net/bluetooth/hci_core.h:2004 [inline] hci_le_create_big_complete_evt+0x3d9/0xae0 net/bluetooth/hci_event.c:6939 hci_event_func net/bluetooth/hci_event.c:7514 [inline] hci_event_packet+0xa53/0x1540 net/bluetooth/hci_event.c:7569 hci_rx_work+0x3e8/0xca0 net/bluetooth/hci_core.c:4171 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 </TASK> Reported-by: [email protected] Tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=2fb0835e0c9cefc34614 Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
…le_direct_reclaim() commit 6aaced5 upstream. The task sometimes continues looping in throttle_direct_reclaim() because allow_direct_reclaim(pgdat) keeps returning false. #0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac #1 [ffff80002cb6f900] __schedule at ffff800008abbd1c #2 [ffff80002cb6f990] schedule at ffff800008abc50c #3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550 #4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68 #5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660 #6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98 torvalds#7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8 torvalds#8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974 torvalds#9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4 At this point, the pgdat contains the following two zones: NODE: 4 ZONE: 0 ADDR: ffff00817fffe540 NAME: "DMA32" SIZE: 20480 MIN/LOW/HIGH: 11/28/45 VM_STAT: NR_FREE_PAGES: 359 NR_ZONE_INACTIVE_ANON: 18813 NR_ZONE_ACTIVE_ANON: 0 NR_ZONE_INACTIVE_FILE: 50 NR_ZONE_ACTIVE_FILE: 0 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0 NODE: 4 ZONE: 1 ADDR: ffff00817fffec00 NAME: "Normal" SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264 VM_STAT: NR_FREE_PAGES: 146 NR_ZONE_INACTIVE_ANON: 94668 NR_ZONE_ACTIVE_ANON: 3 NR_ZONE_INACTIVE_FILE: 735 NR_ZONE_ACTIVE_FILE: 78 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0 In allow_direct_reclaim(), while processing ZONE_DMA32, the sum of inactive/active file-backed pages calculated in zone_reclaimable_pages() based on the result of zone_page_state_snapshot() is zero. Additionally, since this system lacks swap, the calculation of inactive/ active anonymous pages is skipped. crash> p nr_swap_pages nr_swap_pages = $1937 = { counter = 0 } As a result, ZONE_DMA32 is deemed unreclaimable and skipped, moving on to the processing of the next zone, ZONE_NORMAL, despite ZONE_DMA32 having free pages significantly exceeding the high watermark. The problem is that the pgdat->kswapd_failures hasn't been incremented. crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_failures $1935 = 0x0 This is because the node deemed balanced. The node balancing logic in balance_pgdat() evaluates all zones collectively. If one or more zones (e.g., ZONE_DMA32) have enough free pages to meet their watermarks, the entire node is deemed balanced. This causes balance_pgdat() to exit early before incrementing the kswapd_failures, as it considers the overall memory state acceptable, even though some zones (like ZONE_NORMAL) remain under significant pressure. The patch ensures that zone_reclaimable_pages() includes free pages (NR_FREE_PAGES) in its calculation when no other reclaimable pages are available (e.g., file-backed or anonymous pages). This change prevents zones like ZONE_DMA32, which have sufficient free pages, from being mistakenly deemed unreclaimable. By doing so, the patch ensures proper node balancing, avoids masking pressure on other zones like ZONE_NORMAL, and prevents infinite loops in throttle_direct_reclaim() caused by allow_direct_reclaim(pgdat) repeatedly returning false. The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused by a node being incorrectly deemed balanced despite pressure in certain zones, such as ZONE_NORMAL. This issue arises from zone_reclaimable_pages() returning 0 for zones without reclaimable file- backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient free pages to be skipped. The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored during reclaim, masking pressure in other zones. Consequently, pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback mechanisms in allow_direct_reclaim() from being triggered, leading to an infinite loop in throttle_direct_reclaim(). This patch modifies zone_reclaimable_pages() to account for free pages (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones with sufficient free pages are not skipped, enabling proper balancing and reclaim behavior. [[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 5a1c84b ("mm: remove reclaim and compaction retry approximations") Signed-off-by: Seiji Nishikawa <[email protected]> Cc: Mel Gorman <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
[ Upstream commit 5641e82 ] Clear the port select structure on error so no stale values left after definers are destroyed. That's because the mlx5_lag_destroy_definers() always try to destroy all lag definers in the tt_map, so in the flow below lag definers get double-destroyed and cause kernel crash: mlx5_lag_port_sel_create() mlx5_lag_create_definers() mlx5_lag_create_definer() <- Failed on tt 1 mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed mlx5_lag_port_sel_create() mlx5_lag_create_definers() mlx5_lag_create_definer() <- Failed on tt 0 mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00 [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)] CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G OE 6.11.0+ #2 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core] lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core] sp : ffff800085fafb00 x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000 x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000 x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000 x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350 x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0 x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190 x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000 x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000 Call trace: mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core] mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core] mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core] mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core] mlx5_activate_lag+0x60c/0x6f8 [mlx5_core] mlx5_do_bond_work+0x284/0x5c8 [mlx5_core] process_one_work+0x170/0x3e0 worker_thread+0x2d8/0x3e0 kthread+0x11c/0x128 ret_from_fork+0x10/0x20 Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400) ---[ end trace 0000000000000000 ]--- Fixes: dc48516 ("net/mlx5: Lag, add support to create definers for LAG") Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ConchuOD
pushed a commit
that referenced
this pull request
Mar 3, 2025
commit 9860370 upstream. irq_chip functions may be called in raw spinlock context. Therefore, we must also use a raw spinlock for our own internal locking. This fixes the following lockdep splat: [ 5.349336] ============================= [ 5.353349] [ BUG: Invalid wait context ] [ 5.357361] 6.13.0-rc5+ torvalds#69 Tainted: G W [ 5.363031] ----------------------------- [ 5.367045] kworker/u17:1/44 is trying to lock: [ 5.371587] ffffff88018b02c0 (&chip->gpio_lock){....}-{3:3}, at: xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.380079] other info that might help us debug this: [ 5.385138] context-{5:5} [ 5.387762] 5 locks held by kworker/u17:1/44: [ 5.392123] #0: ffffff8800014958 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3204) [ 5.402260] #1: ffffffc082fcbdd8 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3205) [ 5.411528] #2: ffffff880172c900 (&dev->mutex){....}-{4:4}, at: __device_attach (drivers/base/dd.c:1006) [ 5.419929] #3: ffffff88039c8268 (request_class#2){+.+.}-{4:4}, at: __setup_irq (kernel/irq/internals.h:156 kernel/irq/manage.c:1596) [ 5.428331] #4: ffffff88039c80c8 (lock_class#2){....}-{2:2}, at: __setup_irq (kernel/irq/manage.c:1614) [ 5.436472] stack backtrace: [ 5.439359] CPU: 2 UID: 0 PID: 44 Comm: kworker/u17:1 Tainted: G W 6.13.0-rc5+ torvalds#69 [ 5.448690] Tainted: [W]=WARN [ 5.451656] Hardware name: xlnx,zynqmp (DT) [ 5.455845] Workqueue: events_unbound deferred_probe_work_func [ 5.461699] Call trace: [ 5.464147] show_stack+0x18/0x24 C [ 5.467821] dump_stack_lvl (lib/dump_stack.c:123) [ 5.471501] dump_stack (lib/dump_stack.c:130) [ 5.474824] __lock_acquire (kernel/locking/lockdep.c:4828 kernel/locking/lockdep.c:4898 kernel/locking/lockdep.c:5176) [ 5.478758] lock_acquire (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814) [ 5.482429] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 5.486797] xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.490737] irq_enable (kernel/irq/internals.h:236 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345) [ 5.494060] __irq_startup (kernel/irq/internals.h:241 kernel/irq/chip.c:180 kernel/irq/chip.c:250) [ 5.497645] irq_startup (kernel/irq/chip.c:270) [ 5.501143] __setup_irq (kernel/irq/manage.c:1807) [ 5.504728] request_threaded_irq (kernel/irq/manage.c:2208) Fixes: a32c7ca ("gpio: gpio-xilinx: Add interrupt support") Signed-off-by: Sean Anderson <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ConchuOD
added a commit
that referenced
this pull request
Mar 14, 2025
Clearly, my previous fix was lacking, and there's still a build break. Fix it. Signed-off-by: Conor Dooley <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.