Streaming installation aborts on bad network conditions. #1517

@BubuOT

Describe the bug
When doing a streaming installation over an unreliable mobile network connection, we sometimes get the fatal error "Installation error: Failed updating slot rootfs.1: Failed splicing data: Error reading from file: Input/output error", and the installation aborts.

Background information
Buildroot with RAUC v1.11.3 (we recently upgraded to v1.12, but the logs below are still from v1.11.3)

To Reproduce
Hard to reproduce reliably.
Start a streaming installation on a fleet of 2k devices connected via mobile network.
Roughly 10-50 of them will fail and need to be retried, sometimes more than 5 times, which drastically increases the traffic consumed by the update. Some devices never manage to install the update until mobile connectivity improves.
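
For illustration, the manual retries described above could be automated roughly as follows. This is only a sketch of a workaround, not a fix for the underlying abort: it assumes the update is triggered with `rauc install <url>`, and the function name, attempt limit, and delays are made-up values. Authentication headers are left out.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def install_with_retries(
    bundle_url: str,
    max_attempts: int = 6,        # hypothetical limit
    base_delay: float = 30.0,     # hypothetical first delay in seconds
    max_delay: float = 900.0,     # hypothetical upper bound for the delay
    runner: Callable[[Sequence[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry `rauc install` with exponential backoff.

    Returns the number of attempts used, or raises RuntimeError
    if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(["rauc", "install", bundle_url]) == 0:
            return attempt
        if attempt < max_attempts:
            # Double the delay after each failure, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    raise RuntimeError(f"installation failed after {max_attempts} attempts")
```

Each retry starts a new installation, so this does nothing about the extra traffic; it only spaces the attempts out.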

Expected behavior
RAUC continues the installation despite the unreliable network and eventually succeeds.

Logs
kernel:

Jun 03 16:47:57 othermo-8E56A652 kernel: nbd0: detected capacity change from 0 to 209416
Jun 03 16:47:57 othermo-8E56A652 kernel: device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: dm-devel@redhat.com
Jun 03 16:47:58 othermo-8E56A652 kernel: device-mapper: verity: sha256 using implementation "sha256-generic"
Jun 03 16:50:26 othermo-8E56A652 kernel: brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
Jun 03 16:57:21 othermo-8E56A652 kernel: brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
Jun 03 17:03:06 othermo-8E56A652 kernel: INFO: task kcompactd0:43 blocked for more than 120 seconds.
Jun 03 17:03:06 othermo-8E56A652 kernel:       Tainted: G         C        5.15.84-v8 #1
Jun 03 17:03:06 othermo-8E56A652 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 03 17:03:06 othermo-8E56A652 kernel: task:kcompactd0      state:D stack:    0 pid:   43 ppid:     2 flags:0x00000008
Jun 03 17:03:06 othermo-8E56A652 kernel: Call trace:
Jun 03 17:03:06 othermo-8E56A652 kernel:  __switch_to+0x110/0x170
Jun 03 17:03:06 othermo-8E56A652 kernel:  __schedule+0x314/0x870
Jun 03 17:03:06 othermo-8E56A652 kernel:  schedule+0x6c/0x130
Jun 03 17:03:06 othermo-8E56A652 kernel:  io_schedule+0x44/0x70
Jun 03 17:03:06 othermo-8E56A652 kernel:  wait_on_page_bit_common+0x114/0x310
Jun 03 17:03:06 othermo-8E56A652 kernel:  __lock_page+0x5c/0x80
Jun 03 17:03:06 othermo-8E56A652 kernel:  migrate_pages+0x150/0xa70
Jun 03 17:03:06 othermo-8E56A652 kernel:  compact_zone+0x59c/0xe40
Jun 03 17:03:06 othermo-8E56A652 kernel:  proactive_compact_node+0x78/0xb0
Jun 03 17:03:06 othermo-8E56A652 kernel:  kcompactd+0x1a0/0x3f0
Jun 03 17:03:06 othermo-8E56A652 kernel:  kthread+0x140/0x160
Jun 03 17:03:06 othermo-8E56A652 kernel:  ret_from_fork+0x10/0x20
Jun 03 17:04:16 othermo-8E56A652 kernel: brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
Jun 03 17:05:07 othermo-8E56A652 kernel: INFO: task kcompactd0:43 blocked for more than 241 seconds.
Jun 03 17:05:07 othermo-8E56A652 kernel:       Tainted: G         C        5.15.84-v8 #1
Jun 03 17:05:07 othermo-8E56A652 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 03 17:05:07 othermo-8E56A652 kernel: task:kcompactd0      state:D stack:    0 pid:   43 ppid:     2 flags:0x00000008
Jun 03 17:05:07 othermo-8E56A652 kernel: Call trace:
Jun 03 17:05:07 othermo-8E56A652 kernel:  __switch_to+0x110/0x170
Jun 03 17:05:07 othermo-8E56A652 kernel:  __schedule+0x314/0x870
Jun 03 17:05:07 othermo-8E56A652 kernel:  schedule+0x6c/0x130
Jun 03 17:05:07 othermo-8E56A652 kernel:  io_schedule+0x44/0x70
Jun 03 17:05:07 othermo-8E56A652 kernel:  wait_on_page_bit_common+0x114/0x310
Jun 03 17:05:07 othermo-8E56A652 kernel:  __lock_page+0x5c/0x80
Jun 03 17:05:07 othermo-8E56A652 kernel:  migrate_pages+0x150/0xa70
Jun 03 17:05:07 othermo-8E56A652 kernel:  compact_zone+0x59c/0xe40
Jun 03 17:05:07 othermo-8E56A652 kernel:  proactive_compact_node+0x78/0xb0
Jun 03 17:05:07 othermo-8E56A652 kernel:  kcompactd+0x1a0/0x3f0
Jun 03 17:05:07 othermo-8E56A652 kernel:  kthread+0x140/0x160
Jun 03 17:05:07 othermo-8E56A652 kernel:  ret_from_fork+0x10/0x20
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: Connection timed out, retrying (1/1 alive)
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: Receive control failed (result -32)
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: Dead connection, failed to find a fallback
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: shutting down sockets
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 73824 op 0x0:(READ) flags 0x800 phys_seg 8 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0x240c439: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [240c439]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block 240c439, size 7263
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [240c439]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block 240c439, size 7263
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [240c439]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block 240c439, size 7263
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 208016 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 73824 op 0x0:(READ) flags 0x800 phys_seg 8 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0x240c439: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 73824 op 0x0:(READ) flags 0x800 phys_seg 8 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0x240c439: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 207360 op 0x0:(READ) flags 0x4000 phys_seg 32 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 1888 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 207616 op 0x0:(READ) flags 0x0 phys_seg 19 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0xec5f4: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 207784 op 0x0:(READ) flags 0x0 phys_seg 11 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [ec5f4]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block ec5f4, size 5ad
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [ec5f4]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block ec5f4, size 5ad
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read data cache entry [ec5f4]
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Unable to read page, block ec5f4, size 5ad
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 1888 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0xec5f4: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: blk_update_request: I/O error, dev nbd0, sector 1888 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
Jun 03 17:05:52 othermo-8E56A652 kernel: SQUASHFS error: Failed to read block 0xec5f4: -5
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: NBD_DISCONNECT
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: Send disconnect failed -32
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: NBD_DISCONNECT
Jun 03 17:05:52 othermo-8E56A652 kernel: block nbd0: Send disconnect failed -32

rauc:

Jun 03 16:47:56 othermo-8E56A652 rauc[245]: input bundle: <REDACTED>
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: Active slot bootname: A
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: installing <REDACTED>: started
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: Installation 2ffa34f8-1450-4484-8e0a-a3cbe5d04910 started
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: installing <REDACTED>: Checking and mounting bundle...
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: Remote URI detected, streaming bundle...
Jun 03 16:47:56 othermo-8E56A652 rauc[245]: sending: {'url': <REDACTED>, 'headers': <['Authorization: <REDACTED> }
Jun 03 16:47:56 othermo-8E56A652 rauc-nbd[460941]: running as UID 65534, GID 65534
Jun 03 16:47:56 othermo-8E56A652 rauc-nbd[460941]: received: {'url': <REDACTED>}
Jun 03 16:47:56 othermo-8E56A652 rauc-nbd[460941]: configuring for URL: <REDACTED>
Jun 03 16:47:57 othermo-8E56A652 rauc-nbd[460941]: total size 107223365
Jun 03 16:47:57 othermo-8E56A652 rauc-nbd[460941]: server date 1717426077
Jun 03 16:47:57 othermo-8E56A652 rauc-nbd[460941]: file date 1716468420
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: received total size 107223365
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: received current time 1717426077
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: received modified time 1716468420
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: nbd server started
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: Reading bundle: <REDACTED>
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: Verifying bundle signature... 
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: Verified inline signature by 'C = DE, O = othermo.de, OU = othermo Software Certificate Authority, CN = othermo Software CA'
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: Mounting bundle '<REDACTED>'
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: configuring nbd device
Jun 03 16:47:57 othermo-8E56A652 rauc[245]: setup done for /dev/nbd0
Jun 03 16:48:02 othermo-8E56A652 rauc[245]: Configured dm-verity device '/dev/dm-0'
Jun 03 16:48:11 othermo-8E56A652 rauc[245]: Checking image type for slot type: boot-mbr-switch
Jun 03 16:48:11 othermo-8E56A652 rauc[245]: Image detected as type: *.vfat
Jun 03 16:48:11 othermo-8E56A652 rauc[245]: Checking image type for slot type: ext4
Jun 03 16:48:11 othermo-8E56A652 rauc[245]: Image detected as type: *.ext4
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: Marking target slot rootfs.1 as non-bootable...
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slots...
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: installing <REDACTED>: Checking slot bootloader.0
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: Updating /dev/mmcblk0 with /run/rauc/bundle/boot.vfat
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slot bootloader.0
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: Found inactive (first) half of boot partition region (pos. 4194304B, size 33554432B)
Jun 03 16:48:12 othermo-8E56A652 rauc[245]: Clearing inactive (first) half of boot partition region on /dev/mmcblk0
Jun 03 16:48:14 othermo-8E56A652 rauc[245]: Write image to inactive (first) half of boot partition region on /dev/mmcblk0
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: Setting first half of boot partition region active in MBR
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slot bootloader.0 status
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slot bootloader.0 done
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: installing <REDACTED>: Checking slot rootfs.1
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: Updating /dev/mmcblk0p6 with /run/rauc/bundle/rootfs.ext4
Jun 03 16:49:02 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slot rootfs.1
Jun 03 17:05:52 othermo-8E56A652 rauc-nbd[460941]: exiting nbd server
Jun 03 17:05:52 othermo-8E56A652 rauc-nbd[460941]: nbd server failed with: failed to read request from client: Unexpected end of file
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: Continuing after adaptive mode error: no chunk with required hash [3483234d4d01c46adfb1142f2bb6e0d65162a7caa2f370a2bfcac324f2651ad4] found
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: opening slot device /dev/mmcblk0p6
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: writing data to device /dev/mmcblk0p6
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: r_nbd_remove_device
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: nbd server stopping
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: Installation error: Failed updating slot rootfs.1: Failed splicing data: Error reading from file: Input/output error
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: installing <REDACTED>: Installation error: Failed updating slot rootfs.1: Failed splicing data: Error reading from file: Input/output error
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: installing <REDACTED>: finished
Jun 03 17:05:52 othermo-8E56A652 rauc[245]: installing `<REDACTED>` failed: 1
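
To see how long a device sat in the rootfs write before aborting, the time between two log lines can be computed with a small helper. This is a sketch that assumes the journal's short timestamp format shown above (no year, English month names) and logs that do not cross a year boundary; the function names are made up for illustration.

```python
from datetime import datetime
from typing import Iterable, Optional

# Journal short format has no year; a fixed leap year is prepended
# so that every calendar date parses.
TS_FORMAT = "%Y %b %d %H:%M:%S"


def timestamp_of(line: str) -> datetime:
    """Parse the leading 'Jun 03 16:49:02' style timestamp of a log line."""
    return datetime.strptime("2000 " + line[:15], TS_FORMAT)


def seconds_between(
    lines: Iterable[str], start_marker: str, end_marker: str
) -> Optional[float]:
    """Seconds from the first line containing start_marker to the next
    later line containing end_marker, or None if either is missing."""
    start = None
    for line in lines:
        if start is None and start_marker in line:
            start = timestamp_of(line)
        elif start is not None and end_marker in line:
            return (timestamp_of(line) - start).total_seconds()
    return None


log = [
    "Jun 03 16:49:02 othermo-8E56A652 rauc[245]: installing <REDACTED>: Updating slot rootfs.1",
    "Jun 03 17:05:52 othermo-8E56A652 rauc-nbd[460941]: exiting nbd server",
    "Jun 03 17:05:52 othermo-8E56A652 rauc[245]: Installation error: Failed updating slot rootfs.1: Failed splicing data: Error reading from file: Input/output error",
]
stall = seconds_between(log, "Updating slot rootfs.1", "Installation error")
```

For the log above, `stall` is 1010.0 seconds, i.e. just under 17 minutes between the start of the rootfs.1 update and the abort.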
