Fix for paddr == 0 for VSP #41

andr2000 · 2018-08-27T05:58:07Z

This solution was pointed to us by Renesas, they are about to release/upstream it soon.
The solution is to revert [1] and apply [2].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/drivers/gpu/drm/rcar-du?h=v4.14/rcar-3.7.0&id=1d9ad6c074a831131dd39d0a27ed8bc11335de4a
[2] https://patchwork.freedesktop.org/series/33702/

This was tested on M3 with display backend running in DRM mode

andr2000 · 2018-08-28T07:46:47Z

@iartemenko pls merge

This reverts commit 7ac2140.

This reverts commit 1d9ad6c.

The DU DMA address space is limited to 32 bits, so the DMA coherent mask should be set accordingly. The DMA mapping implementation will transparently map high memory buffers to 32-bit addresses through an IOMMU when present (or through bounce buffers otherwise, which isn't a supported use case as performances would be terrible). However, when sourcing frames from a VSP, the situation is more complicated. The DU delegates all memory accesses to the VSP and doesn't perform any DMA access by itself. Due to how the GEM CMA helpers are structured buffers are still mapped to the DU device. They are later mapped to the VSP as well to perform DMA access, through the IOMMU connected to the VSP. Setting the DMA coherent mask to 32 bits for the DU when using a VSP can cause issues when importing a dma_buf. If the buffer is located above the 32-bit address space, the DMA mapping implementation will try to map it to the DU's DMA address space. As the DU has no IOMMU a bounce buffer will be allocated, which in the best case will waste memory and in the worst case will just fail. To work around this issue, set the DMA coherent mask to the full 40-bit address space for the DU. All dma-buf instances will be imported without any restriction, and will be mapped to the VSP when preparing the associated framebuffer. Signed-off-by: Laurent Pinchart <[email protected]> Reviewed-by: Liviu Dudau <[email protected]>

When the DU sources its frames from a VSP, it performs no memory access and thus has no requirements on imported dma-buf memory types. In particular the DU could import a physically non-contiguous buffer that would later be mapped contiguously through the VSP IOMMU. This use case isn't supported at the moment as the GEM CMA helpers will reject any non-contiguous buffer, and the DU isn't connected to an IOMMU that can make the buffer contiguous for DMA. Fix this by implementing a custom .gem_prime_import_sg_table() operation that accepts all imported dma-buf regardless of the number of scatterlist entries. Signed-off-by: Laurent Pinchart <[email protected]>

Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe xen-troops#4 [9a068ba8] 440 bytes lock_acquire at 3576ee xen-troops#5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 xen-troops#6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 xen-troops#7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 xen-troops#8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 xen-troops#9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c xen-troops#10 [9a069dd0] 168 bytes wake_up_worker at 23f97c xen-troops#11 [9a069e78] 200 bytes insert_work at 23fc2e xen-troops#12 [9a069f40] 648 bytes __queue_work at 2487c0 xen-troops#13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 xen-troops#14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 xen-troops#15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 xen-troops#16 [9a06a3a] 288 bytes __blk_mq_delay_run_hw_queue at 158168c xen-troops#17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c xen-troops#18 [9a06a58] 184 bytes blk_mq_sched_insert_requests at 15a2192 xen-troops#19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a xen-troops#20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 xen-troops#21 [9a06acf8] 320 bytes schedule at 219e476 xen-troops#22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac xen-troops#23 [9a06b130] 408 bytes wait_for_common at 21a1706 xen-troops#24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 xen-troops#25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 xen-troops#26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 xen-troops#27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 xen-troops#28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 xen-troops#29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e xen-troops#30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 xen-troops#31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 xen-troops#32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca xen-troops#33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce xen-troops#34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 xen-troops#35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de xen-troops#36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e xen-troops#37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 xen-troops#38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 xen-troops#39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 xen-troops#40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 xen-troops#41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 xen-troops#42 [9a06e518] 88 bytes do_writepages at 73fe6a xen-troops#43 [9a06e57] 872 bytes __writeback_single_inode at a20cb6 xen-troops#44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 xen-troops#45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 xen-troops#46 [9a06ec98] 928 bytes wb_writeback at a2500e xen-troops#47 [9a06f038] 848 bytes wb_do_writeback at a260ae xen-troops#48 [9a06f388] 536 bytes wb_workfn at a28228 xen-troops#49 [9a06f5a0] 1088 bytes process_one_work at 24a234 xen-troops#50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 xen-troops#51 [9a06fe40] 104 bytes kthread at 26545a xen-troops#52 [9a06fea] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>

Fix double-free that happens when thermal zone setup fails, see KASAN log below. ================================================================== BUG: KASAN: double-free or invalid-free in __hwmon_device_register+0x5dc/0xa7c CPU: 0 PID: 132 Comm: kworker/0:2 Tainted: G B 4.19.0-rc8-next-20181016-00042-gb52cd80401e9-dirty xen-troops#41 Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) Workqueue: events deferred_probe_work_func Backtrace: [<c0110540>] (dump_backtrace) from [<c0110944>] (show_stack+0x20/0x24) [<c0110924>] (show_stack) from [<c105cb08>] (dump_stack+0x9c/0xb0) [<c105ca6c>] (dump_stack) from [<c02fdaec>] (print_address_description+0x68/0x250) [<c02fda84>] (print_address_description) from [<c02fd4ac>] (kasan_report_invalid_free+0x68/0x88) [<c02fd444>] (kasan_report_invalid_free) from [<c02fc85c>] (__kasan_slab_free+0x1f4/0x200) [<c02fc668>] (__kasan_slab_free) from [<c02fd0c0>] (kasan_slab_free+0x14/0x18) [<c02fd0ac>] (kasan_slab_free) from [<c02f9c6c>] (kfree+0x90/0x294) [<c02f9bdc>] (kfree) from [<c0b41bbc>] (__hwmon_device_register+0x5dc/0xa7c) [<c0b415e0>] (__hwmon_device_register) from [<c0b421e8>] (hwmon_device_register_with_info+0xa0/0xa8) [<c0b42148>] (hwmon_device_register_with_info) from [<c0b42324>] (devm_hwmon_device_register_with_info+0x74/0xb4) [<c0b422b0>] (devm_hwmon_device_register_with_info) from [<c0b4481c>] (lm90_probe+0x414/0x578) [<c0b44408>] (lm90_probe) from [<c0aeeff4>] (i2c_device_probe+0x35c/0x384) [<c0aeec98>] (i2c_device_probe) from [<c08776cc>] (really_probe+0x290/0x3e4) [<c087743c>] (really_probe) from [<c0877a2c>] (driver_probe_device+0x80/0x1c4) [<c08779ac>] (driver_probe_device) from [<c0877da8>] (__device_attach_driver+0x104/0x11c) [<c0877ca4>] (__device_attach_driver) from [<c0874dd8>] (bus_for_each_drv+0xa4/0xc8) [<c0874d34>] (bus_for_each_drv) from [<c08773b0>] (__device_attach+0xf0/0x15c) [<c08772c0>] (__device_attach) from [<c0877e24>] (device_initial_probe+0x1c/0x20) [<c0877e08>] (device_initial_probe) from [<c08762f4>] (bus_probe_device+0xdc/0xec) [<c0876218>] (bus_probe_device) from [<c0876a08>] (deferred_probe_work_func+0xa8/0xd4) [<c0876960>] (deferred_probe_work_func) from [<c01527c4>] (process_one_work+0x3dc/0x96c) [<c01523e8>] (process_one_work) from [<c01541e0>] (worker_thread+0x4ec/0x8bc) [<c0153cf4>] (worker_thread) from [<c015b238>] (kthread+0x230/0x240) [<c015b008>] (kthread) from [<c01010bc>] (ret_from_fork+0x14/0x38) Exception stack(0xcf743fb0 to 0xcf743ff8) 3fa0: 00000000 00000000 00000000 00000000 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 Allocated by task 132: kasan_kmalloc.part.1+0x58/0xf4 kasan_kmalloc+0x90/0xa4 kmem_cache_alloc_trace+0x90/0x2a0 __hwmon_device_register+0xbc/0xa7c hwmon_device_register_with_info+0xa0/0xa8 devm_hwmon_device_register_with_info+0x74/0xb4 lm90_probe+0x414/0x578 i2c_device_probe+0x35c/0x384 really_probe+0x290/0x3e4 driver_probe_device+0x80/0x1c4 __device_attach_driver+0x104/0x11c bus_for_each_drv+0xa4/0xc8 __device_attach+0xf0/0x15c device_initial_probe+0x1c/0x20 bus_probe_device+0xdc/0xec deferred_probe_work_func+0xa8/0xd4 process_one_work+0x3dc/0x96c worker_thread+0x4ec/0x8bc kthread+0x230/0x240 ret_from_fork+0x14/0x38 (null) Freed by task 132: __kasan_slab_free+0x12c/0x200 kasan_slab_free+0x14/0x18 kfree+0x90/0x294 hwmon_dev_release+0x1c/0x20 device_release+0x4c/0xe8 kobject_put+0xac/0x11c device_unregister+0x2c/0x30 __hwmon_device_register+0xa58/0xa7c hwmon_device_register_with_info+0xa0/0xa8 devm_hwmon_device_register_with_info+0x74/0xb4 lm90_probe+0x414/0x578 i2c_device_probe+0x35c/0x384 really_probe+0x290/0x3e4 driver_probe_device+0x80/0x1c4 __device_attach_driver+0x104/0x11c bus_for_each_drv+0xa4/0xc8 __device_attach+0xf0/0x15c device_initial_probe+0x1c/0x20 bus_probe_device+0xdc/0xec deferred_probe_work_func+0xa8/0xd4 process_one_work+0x3dc/0x96c worker_thread+0x4ec/0x8bc kthread+0x230/0x240 ret_from_fork+0x14/0x38 (null) Cc: <[email protected]> # v4.15+ Fixes: 47c332d ("hwmon: Deal with errors from the thermal subsystem") Signed-off-by: Dmitry Osipenko <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>

During system reboot or halt, with lockdep enabled: ================================ WARNING: inconsistent lock state 4.18.0-rc1-salvator-x-00002-g9203dbec90a68103 #41 Tainted: G W -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. reboot/2779 [HC0[0]:SC0[0]:HE1:SE1] takes: 0000000098ae4ad3 (&(&rchan->lock)->rlock){?.-.}, at: rcar_dmac_shutdown+0x58/0x6c {IN-HARDIRQ-W} state was registered at: lock_acquire+0x208/0x238 _raw_spin_lock+0x40/0x54 rcar_dmac_isr_channel+0x28/0x200 __handle_irq_event_percpu+0x1c0/0x3c8 handle_irq_event_percpu+0x34/0x88 handle_irq_event+0x48/0x78 handle_fasteoi_irq+0xc4/0x12c generic_handle_irq+0x18/0x2c __handle_domain_irq+0xa8/0xac gic_handle_irq+0x78/0xbc el1_irq+0xec/0x1c0 arch_cpu_idle+0xe8/0x1bc default_idle_call+0x2c/0x30 do_idle+0x144/0x234 cpu_startup_entry+0x20/0x24 rest_init+0x27c/0x290 start_kernel+0x430/0x45c irq event stamp: 12177 hardirqs last enabled at (12177): [<ffffff800881d804>] _raw_spin_unlock_irq+0x2c/0x4c hardirqs last disabled at (12176): [<ffffff800881d638>] _raw_spin_lock_irq+0x1c/0x60 softirqs last enabled at (11948): [<ffffff8008081da8>] __do_softirq+0x160/0x4ec softirqs last disabled at (11935): [<ffffff80080ec948>] irq_exit+0xa0/0xfc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&rchan->lock)->rlock); <Interrupt> lock(&(&rchan->lock)->rlock); *** DEADLOCK *** 3 locks held by reboot/2779: #0: 00000000bfabfa74 (reboot_mutex){+.+.}, at: sys_reboot+0xdc/0x208 #1: 00000000c75d8c3a (&dev->mutex){....}, at: device_shutdown+0xc8/0x1c4 #2: 00000000ebec58ec (&dev->mutex){....}, at: device_shutdown+0xd8/0x1c4 stack backtrace: CPU: 6 PID: 2779 Comm: reboot Tainted: G W 4.18.0-rc1-salvator-x-00002-g9203dbec90a68103 #41 Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) Call trace: dump_backtrace+0x0/0x148 show_stack+0x14/0x1c dump_stack+0xb0/0xf0 print_usage_bug.part.26+0x1c4/0x27c mark_lock+0x38c/0x610 __lock_acquire+0x3fc/0x14d4 lock_acquire+0x208/0x238 _raw_spin_lock+0x40/0x54 rcar_dmac_shutdown+0x58/0x6c platform_drv_shutdown+0x20/0x2c device_shutdown+0x160/0x1c4 kernel_restart_prepare+0x34/0x3c kernel_restart+0x14/0x5c sys_reboot+0x160/0x208 el0_svc_naked+0x30/0x34 rcar_dmac_stop_all_chan() takes the channel lock while stopping a channel, but does not disable interrupts, leading to a deadlock when a DMAC interrupt comes in. Before, the same code block was called from an interrupt handler, hence taking the spinlock was sufficient. Fix this by disabling local interrupts while taking the spinlock. Fixes: 9203dbe ("dmaengine: rcar-dmac: don't use DMAC error interrupt") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Vinod Koul <[email protected]> (cherry picked from commit 45c9a60) Signed-off-by: Hiroyuki Yokoyama <[email protected]>

Fix kernel oops observed when an ext adv data is larger than 31 bytes. This can be reproduced by setting up an advertiser with advertisement larger than 31 bytes. The issue is not sensitive to the advertisement content. In particular, this was reproduced with an advertisement of 229 bytes filled with 'A'. See stack trace below. This is fixed by not catching ext_adv as legacy adv are only cached to be able to concatenate a scanable adv with its scan response before sending it up through mgmt. With ext_adv, this is no longer necessary. general protection fault: 0000 [#1] SMP PTI CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic xen-troops#41-Ubuntu Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020 Workqueue: hci0 hci_rx_work [bluetooth] RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth] Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75 RSP: 0018:ffffbc6a40493c70 EFLAGS: 00010286 RAX: 4141414141414141 RBX: 000000000000001b RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9903e76c100f RDI: ffff9904289d4b28 RBP: ffffbc6a40493c70 R08: 0000000093570362 R09: 0000000000000000 R10: 0000000000000000 R11: ffff9904344eae38 R12: ffff9904289d4000 R13: 0000000000000000 R14: 00000000ffffffa3 R15: ffff9903e76c100f FS: 0000000000000000(0000) GS:ffff990434580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007feed125a000 CR3: 00000001b860a003 CR4: 00000000003606e0 Call Trace: process_adv_report+0x12e/0x560 [bluetooth] hci_le_meta_evt+0x7b2/0xba0 [bluetooth] hci_event_packet+0x1c29/0x2a90 [bluetooth] hci_rx_work+0x19b/0x360 [bluetooth] process_one_work+0x1eb/0x3b0 worker_thread+0x4d/0x400 kthread+0x104/0x140 Fixes: c215e93 ("Bluetooth: Process extended ADV report event") Reported-by: Andy Nguyen <[email protected]> Reported-by: Linus Torvalds <[email protected]> Reported-by: Balakrishna Godavarthi <[email protected]> Signed-off-by: Alain Michaud <[email protected]> Tested-by: Sonny Sasaka <[email protected]> Acked-by: Marcel Holtmann <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

NULL pointer dereference is triggered when calling thp split via debugfs on the system with offlined memory blocks. With debug option enabled, the following kernel messages are printed out: page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000 flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff) raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: unmovable page page:000000007d7ab72e is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1248! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 16 PID: 20964 Comm: bash Tainted: G I 6.0.0-rc3-foll-numa+ xen-troops#41 ... RIP: 0010:split_huge_pages_write+0xcf4/0xe30 This shows that page_to_nid() in page_zone() is unexpectedly called for an offlined memmap. Use pfn_to_online_page() to get struct page in PFN walker. Link: https://lkml.kernel.org/r/[email protected] Fixes: f1dd2cd ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e] Signed-off-by: Naoya Horiguchi <[email protected]> Co-developed-by: David Hildenbrand <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Yang Shi <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Muchun Song <[email protected]> Cc: <[email protected]> [5.10+] Signed-off-by: Andrew Morton <[email protected]>

andr2000 assigned iartemenko Aug 27, 2018

andr2000 requested review from lorc, arminn, otyshchenko1, aanisov and iartemenko August 27, 2018 05:58

Oleksandr Andrushchenko and others added 4 commits August 28, 2018 10:51

Revert "drm: rcar-du: Do not assume VSP has IOMMU"

d810680

This reverts commit 7ac2140.

Revert "drm: rcar-du: Allow importing non-contiguous dma-buf with VSP"

885bc5a

This reverts commit 1d9ad6c.

andr2000 force-pushed the pr_vsp_sgt_fix branch from 735063b to 91aae47 Compare August 28, 2018 07:51

iartemenko merged commit 6e878ac into xen-troops:master Aug 28, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix for paddr == 0 for VSP #41

Fix for paddr == 0 for VSP #41

Uh oh!

andr2000 commented Aug 27, 2018

Uh oh!

andr2000 commented Aug 28, 2018

Uh oh!

Uh oh!

Fix for paddr == 0 for VSP #41

Fix for paddr == 0 for VSP #41

Uh oh!

Conversation

andr2000 commented Aug 27, 2018

Uh oh!

andr2000 commented Aug 28, 2018

Uh oh!

Uh oh!