Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

andr2000
Copy link
Collaborator

This solution was pointed to us by Renesas, they are about to release/upstream it soon.
The solution is to revert [1] and apply [2].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/drivers/gpu/drm/rcar-du?h=v4.14/rcar-3.7.0&id=1d9ad6c074a831131dd39d0a27ed8bc11335de4a
[2] https://patchwork.freedesktop.org/series/33702/

This was tested on M3 with display backend running in DRM mode

@andr2000
Copy link
Collaborator Author

@iartemenko pls merge

Oleksandr Andrushchenko and others added 4 commits August 28, 2018 10:51
The DU DMA address space is limited to 32 bits, so the DMA coherent mask
should be set accordingly. The DMA mapping implementation will
transparently map high memory buffers to 32-bit addresses through an
IOMMU when present (or through bounce buffers otherwise, which isn't a
supported use case as performances would be terrible).

However, when sourcing frames from a VSP, the situation is more
complicated. The DU delegates all memory accesses to the VSP and doesn't
perform any DMA access by itself. Due to how the GEM CMA helpers are
structured buffers are still mapped to the DU device. They are later
mapped to the VSP as well to perform DMA access, through the IOMMU
connected to the VSP.

Setting the DMA coherent mask to 32 bits for the DU when using a VSP can
cause issues when importing a dma_buf. If the buffer is located above
the 32-bit address space, the DMA mapping implementation will try to map
it to the DU's DMA address space. As the DU has no IOMMU a bounce buffer
will be allocated, which in the best case will waste memory and in the
worst case will just fail.

To work around this issue, set the DMA coherent mask to the full 40-bit
address space for the DU. All dma-buf instances will be imported without
any restriction, and will be mapped to the VSP when preparing the
associated framebuffer.

Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Liviu Dudau <[email protected]>
When the DU sources its frames from a VSP, it performs no memory access
and thus has no requirements on imported dma-buf memory types. In
particular the DU could import a physically non-contiguous buffer that
would later be mapped contiguously through the VSP IOMMU.

This use case isn't supported at the moment as the GEM CMA helpers will
reject any non-contiguous buffer, and the DU isn't connected to an IOMMU
that can make the buffer contiguous for DMA. Fix this by implementing a
custom .gem_prime_import_sg_table() operation that accepts all imported
dma-buf regardless of the number of scatterlist entries.

Signed-off-by: Laurent Pinchart <[email protected]>
@iartemenko iartemenko merged commit 6e878ac into xen-troops:master Aug 28, 2018
andr2000 pushed a commit to andr2000/linux that referenced this pull request Nov 27, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 xen-troops#4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 xen-troops#5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 xen-troops#6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 xen-troops#7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 xen-troops#8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 xen-troops#9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 xen-troops#10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 xen-troops#11 [9a069e78]  200 bytes  insert_work at 23fc2e
 xen-troops#12 [9a069f40]  648 bytes  __queue_work at 2487c0
 xen-troops#13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 xen-troops#14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 xen-troops#15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 xen-troops#16 [9a06a3a]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 xen-troops#17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 xen-troops#18 [9a06a58]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 xen-troops#19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 xen-troops#20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 xen-troops#21 [9a06acf8]  320 bytes  schedule at 219e476
 xen-troops#22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 xen-troops#23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 xen-troops#24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 xen-troops#25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 xen-troops#26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 xen-troops#27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 xen-troops#28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 xen-troops#29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 xen-troops#30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 xen-troops#31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 xen-troops#32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 xen-troops#33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 xen-troops#34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 xen-troops#35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 xen-troops#36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 xen-troops#37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 xen-troops#38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 xen-troops#39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 xen-troops#40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 xen-troops#41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 xen-troops#42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 xen-troops#43 [9a06e57]  872 bytes  __writeback_single_inode at a20cb6
 xen-troops#44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 xen-troops#45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 xen-troops#46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 xen-troops#47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 xen-troops#48 [9a06f388]  536 bytes  wb_workfn at a28228
 xen-troops#49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 xen-troops#50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 xen-troops#51 [9a06fe40]  104 bytes  kthread at 26545a
 xen-troops#52 [9a06fea]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this pull request Nov 27, 2018
Fix double-free that happens when thermal zone setup fails, see KASAN log
below.

==================================================================
BUG: KASAN: double-free or invalid-free in __hwmon_device_register+0x5dc/0xa7c

CPU: 0 PID: 132 Comm: kworker/0:2 Tainted: G    B             4.19.0-rc8-next-20181016-00042-gb52cd80401e9-dirty xen-troops#41
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
Workqueue: events deferred_probe_work_func
Backtrace:
[<c0110540>] (dump_backtrace) from [<c0110944>] (show_stack+0x20/0x24)
[<c0110924>] (show_stack) from [<c105cb08>] (dump_stack+0x9c/0xb0)
[<c105ca6c>] (dump_stack) from [<c02fdaec>] (print_address_description+0x68/0x250)
[<c02fda84>] (print_address_description) from [<c02fd4ac>] (kasan_report_invalid_free+0x68/0x88)
[<c02fd444>] (kasan_report_invalid_free) from [<c02fc85c>] (__kasan_slab_free+0x1f4/0x200)
[<c02fc668>] (__kasan_slab_free) from [<c02fd0c0>] (kasan_slab_free+0x14/0x18)
[<c02fd0ac>] (kasan_slab_free) from [<c02f9c6c>] (kfree+0x90/0x294)
[<c02f9bdc>] (kfree) from [<c0b41bbc>] (__hwmon_device_register+0x5dc/0xa7c)
[<c0b415e0>] (__hwmon_device_register) from [<c0b421e8>] (hwmon_device_register_with_info+0xa0/0xa8)
[<c0b42148>] (hwmon_device_register_with_info) from [<c0b42324>] (devm_hwmon_device_register_with_info+0x74/0xb4)
[<c0b422b0>] (devm_hwmon_device_register_with_info) from [<c0b4481c>] (lm90_probe+0x414/0x578)
[<c0b44408>] (lm90_probe) from [<c0aeeff4>] (i2c_device_probe+0x35c/0x384)
[<c0aeec98>] (i2c_device_probe) from [<c08776cc>] (really_probe+0x290/0x3e4)
[<c087743c>] (really_probe) from [<c0877a2c>] (driver_probe_device+0x80/0x1c4)
[<c08779ac>] (driver_probe_device) from [<c0877da8>] (__device_attach_driver+0x104/0x11c)
[<c0877ca4>] (__device_attach_driver) from [<c0874dd8>] (bus_for_each_drv+0xa4/0xc8)
[<c0874d34>] (bus_for_each_drv) from [<c08773b0>] (__device_attach+0xf0/0x15c)
[<c08772c0>] (__device_attach) from [<c0877e24>] (device_initial_probe+0x1c/0x20)
[<c0877e08>] (device_initial_probe) from [<c08762f4>] (bus_probe_device+0xdc/0xec)
[<c0876218>] (bus_probe_device) from [<c0876a08>] (deferred_probe_work_func+0xa8/0xd4)
[<c0876960>] (deferred_probe_work_func) from [<c01527c4>] (process_one_work+0x3dc/0x96c)
[<c01523e8>] (process_one_work) from [<c01541e0>] (worker_thread+0x4ec/0x8bc)
[<c0153cf4>] (worker_thread) from [<c015b238>] (kthread+0x230/0x240)
[<c015b008>] (kthread) from [<c01010bc>] (ret_from_fork+0x14/0x38)
Exception stack(0xcf743fb0 to 0xcf743ff8)
3fa0:                                     00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Allocated by task 132:
 kasan_kmalloc.part.1+0x58/0xf4
 kasan_kmalloc+0x90/0xa4
 kmem_cache_alloc_trace+0x90/0x2a0
 __hwmon_device_register+0xbc/0xa7c
 hwmon_device_register_with_info+0xa0/0xa8
 devm_hwmon_device_register_with_info+0x74/0xb4
 lm90_probe+0x414/0x578
 i2c_device_probe+0x35c/0x384
 really_probe+0x290/0x3e4
 driver_probe_device+0x80/0x1c4
 __device_attach_driver+0x104/0x11c
 bus_for_each_drv+0xa4/0xc8
 __device_attach+0xf0/0x15c
 device_initial_probe+0x1c/0x20
 bus_probe_device+0xdc/0xec
 deferred_probe_work_func+0xa8/0xd4
 process_one_work+0x3dc/0x96c
 worker_thread+0x4ec/0x8bc
 kthread+0x230/0x240
 ret_from_fork+0x14/0x38
   (null)

Freed by task 132:
 __kasan_slab_free+0x12c/0x200
 kasan_slab_free+0x14/0x18
 kfree+0x90/0x294
 hwmon_dev_release+0x1c/0x20
 device_release+0x4c/0xe8
 kobject_put+0xac/0x11c
 device_unregister+0x2c/0x30
 __hwmon_device_register+0xa58/0xa7c
 hwmon_device_register_with_info+0xa0/0xa8
 devm_hwmon_device_register_with_info+0x74/0xb4
 lm90_probe+0x414/0x578
 i2c_device_probe+0x35c/0x384
 really_probe+0x290/0x3e4
 driver_probe_device+0x80/0x1c4
 __device_attach_driver+0x104/0x11c
 bus_for_each_drv+0xa4/0xc8
 __device_attach+0xf0/0x15c
 device_initial_probe+0x1c/0x20
 bus_probe_device+0xdc/0xec
 deferred_probe_work_func+0xa8/0xd4
 process_one_work+0x3dc/0x96c
 worker_thread+0x4ec/0x8bc
 kthread+0x230/0x240
 ret_from_fork+0x14/0x38
   (null)

Cc: <[email protected]> # v4.15+
Fixes: 47c332d ("hwmon: Deal with errors from the thermal subsystem")
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
otyshchenko1 pushed a commit that referenced this pull request May 7, 2019
During system reboot or halt, with lockdep enabled:

    ================================
    WARNING: inconsistent lock state
    4.18.0-rc1-salvator-x-00002-g9203dbec90a68103 #41 Tainted: G        W
    --------------------------------
    inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
    reboot/2779 [HC0[0]:SC0[0]:HE1:SE1] takes:
    0000000098ae4ad3 (&(&rchan->lock)->rlock){?.-.}, at: rcar_dmac_shutdown+0x58/0x6c
    {IN-HARDIRQ-W} state was registered at:
      lock_acquire+0x208/0x238
      _raw_spin_lock+0x40/0x54
      rcar_dmac_isr_channel+0x28/0x200
      __handle_irq_event_percpu+0x1c0/0x3c8
      handle_irq_event_percpu+0x34/0x88
      handle_irq_event+0x48/0x78
      handle_fasteoi_irq+0xc4/0x12c
      generic_handle_irq+0x18/0x2c
      __handle_domain_irq+0xa8/0xac
      gic_handle_irq+0x78/0xbc
      el1_irq+0xec/0x1c0
      arch_cpu_idle+0xe8/0x1bc
      default_idle_call+0x2c/0x30
      do_idle+0x144/0x234
      cpu_startup_entry+0x20/0x24
      rest_init+0x27c/0x290
      start_kernel+0x430/0x45c
    irq event stamp: 12177
    hardirqs last  enabled at (12177): [<ffffff800881d804>] _raw_spin_unlock_irq+0x2c/0x4c
    hardirqs last disabled at (12176): [<ffffff800881d638>] _raw_spin_lock_irq+0x1c/0x60
    softirqs last  enabled at (11948): [<ffffff8008081da8>] __do_softirq+0x160/0x4ec
    softirqs last disabled at (11935): [<ffffff80080ec948>] irq_exit+0xa0/0xfc

    other info that might help us debug this:
     Possible unsafe locking scenario:

	   CPU0
	   ----
      lock(&(&rchan->lock)->rlock);
      <Interrupt>
	lock(&(&rchan->lock)->rlock);

     *** DEADLOCK ***

    3 locks held by reboot/2779:
     #0: 00000000bfabfa74 (reboot_mutex){+.+.}, at: sys_reboot+0xdc/0x208
     #1: 00000000c75d8c3a (&dev->mutex){....}, at: device_shutdown+0xc8/0x1c4
     #2: 00000000ebec58ec (&dev->mutex){....}, at: device_shutdown+0xd8/0x1c4

    stack backtrace:
    CPU: 6 PID: 2779 Comm: reboot Tainted: G        W         4.18.0-rc1-salvator-x-00002-g9203dbec90a68103 #41
    Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT)
    Call trace:
     dump_backtrace+0x0/0x148
     show_stack+0x14/0x1c
     dump_stack+0xb0/0xf0
     print_usage_bug.part.26+0x1c4/0x27c
     mark_lock+0x38c/0x610
     __lock_acquire+0x3fc/0x14d4
     lock_acquire+0x208/0x238
     _raw_spin_lock+0x40/0x54
     rcar_dmac_shutdown+0x58/0x6c
     platform_drv_shutdown+0x20/0x2c
     device_shutdown+0x160/0x1c4
     kernel_restart_prepare+0x34/0x3c
     kernel_restart+0x14/0x5c
     sys_reboot+0x160/0x208
     el0_svc_naked+0x30/0x34

rcar_dmac_stop_all_chan() takes the channel lock while stopping a
channel, but does not disable interrupts, leading to a deadlock when a
DMAC interrupt comes in.  Before, the same code block was called from an
interrupt handler, hence taking the spinlock was sufficient.

Fix this by disabling local interrupts while taking the spinlock.

Fixes: 9203dbe ("dmaengine: rcar-dmac: don't use DMAC error interrupt")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
(cherry picked from commit 45c9a60)
Signed-off-by: Hiroyuki Yokoyama <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this pull request Aug 13, 2020
Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not catching ext_adv as legacy adv are only cached to
be able to concatenate a scanable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault: 0000 [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic xen-troops#41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:ffffbc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 000000000000001b RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff9903e76c100f RDI: ffff9904289d4b28
  RBP: ffffbc6a40493c70 R08: 0000000093570362 R09: 0000000000000000
  R10: 0000000000000000 R11: ffff9904344eae38 R12: ffff9904289d4000
  R13: 0000000000000000 R14: 00000000ffffffa3 R15: ffff9903e76c100f
  FS: 0000000000000000(0000) GS:ffff990434580000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007feed125a000 CR3: 00000001b860a003 CR4: 00000000003606e0
  Call Trace:
    process_adv_report+0x12e/0x560 [bluetooth]
    hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
    hci_event_packet+0x1c29/0x2a90 [bluetooth]
    hci_rx_work+0x19b/0x360 [bluetooth]
    process_one_work+0x1eb/0x3b0
    worker_thread+0x4d/0x400
    kthread+0x104/0x140

Fixes: c215e93 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen <[email protected]>
Reported-by: Linus Torvalds <[email protected]>
Reported-by: Balakrishna Godavarthi <[email protected]>
Signed-off-by: Alain Michaud <[email protected]>
Tested-by: Sonny Sasaka <[email protected]>
Acked-by: Marcel Holtmann <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this pull request Oct 24, 2022
NULL pointer dereference is triggered when calling thp split via debugfs
on the system with offlined memory blocks.  With debug option enabled, the
following kernel messages are printed out:

  page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000
  flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
  raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:000000007d7ab72e is uninitialized and poisoned
  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
  ------------[ cut here ]------------
  kernel BUG at include/linux/mm.h:1248!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 16 PID: 20964 Comm: bash Tainted: G          I        6.0.0-rc3-foll-numa+ xen-troops#41
  ...
  RIP: 0010:split_huge_pages_write+0xcf4/0xe30

This shows that page_to_nid() in page_zone() is unexpectedly called for an
offlined memmap.

Use pfn_to_online_page() to get struct page in PFN walker.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: f1dd2cd ("mm, memory_hotplug: do not associate hotadded memory to zones until online")      [visible after d0dc12e]
Signed-off-by: Naoya Horiguchi <[email protected]>
Co-developed-by: David Hildenbrand <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: <[email protected]>	[5.10+]
Signed-off-by: Andrew Morton <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants