efi: Get the correct esp on multipath when install #1006
Code Review
This pull request adds logic to determine the partition number for multipath devices, as they don't have a partition attribute in sysfs. The approach is to extract trailing digits from the device name.
My review identifies a bug in the implementation where multi-digit partition numbers would be reversed (e.g., '12' becomes '21'). I've provided a suggestion to fix this and simplify the code.
Additionally, I've recommended replacing a debug assert! with proper error handling to ensure the function fails gracefully if a partition number cannot be determined, which is more robust for release builds.
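For illustration, here is a minimal Rust sketch of extracting trailing digits while keeping their order, which avoids the reversal described above. The function name `trailing_digits` is hypothetical and is not taken from the PR. It returns `None` instead of asserting, so a caller can turn a missing partition number into a proper error.

```rust
/// Return the trailing ASCII digits of `name` in their original order.
/// Returns `None` when `name` does not end in a digit.
/// (Hypothetical helper for illustration; not the PR's implementation.)
fn trailing_digits(name: &str) -> Option<&str> {
    // Trim digits from the end, then slice out what was trimmed.
    // Slicing preserves the left-to-right order, so "12" stays "12".
    let stem = name.trim_end_matches(|c: char| c.is_ascii_digit());
    let digits = &name[stem.len()..];
    if digits.is_empty() {
        None
    } else {
        Some(digits)
    }
}
```

A caller could map `None` to an error such as "failed to determine partition number" rather than relying on `assert!`.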
Update testing result:
    .with_context(|| format!("Failed to read {partition_path}"))?;
} else {
    // On multipath, there is no partition attribute
    // get the part number from the device path "/dev/dm-2"
This doesn't seem right to me; the device mapper number here has nothing to do with partitions. It's just a unique identifier for the virtual device.
Crucially, DM is a Linux level concept and doesn't exist for EFI.
I think what we need to do here instead is "peel" through the multipath to find the block device(s) backing it, and then pick one of those.
Thank you for the pointer! Sorry that I misunderstood this. You are right, dm is just a device-mapper device, which might be LVM.
> I think what we need to do here instead is "peel" through the multipath to find the block device(s) backing it, and then pick one of those.

I looked into this more, but I'm not sure my understanding is correct. Do we need to check whether it is a multipath environment?
[root@cosa-devsh core]# realpath /dev/mapper/0x27ba642d851ba0f0
/dev/dm-0
# ls /sys/block/dm-0/slaves/
sda sdb
### find esp device from sda get '2'
# efibootmgr -c --disk /dev/sda --part 2 --loader \\EFI\\fedora\\shimx64.efi --label "Test"
Updated. Could you review again? Thank you!
Hmm...OK so this is an interesting topic. With say RAID or LVM, there is no strict correspondence between the dm-<N> device and any physical partitions.
Backing up here for a second, do we actually need the --part N argument? Shouldn't the firmware discover the partition automatically? Can we test omitting it in this case?
But assuming we do need it...now specifically with multipath it does create a 1:1 correspondence between partitions and device-mapper devices it seems. I guess we can rely on that, and pick up the last number from e.g. /dev/mapper/0xbae1534ba0c9352a4.
> Backing up here for a second, do we actually need the --part N argument? Shouldn't the firmware discover the partition automatically? Can we test omitting it in this case?

I think we still need this. Testing without --part, it defaults to the first partition, which is not what we want.
[root@cosa-devsh core]# efibootmgr -c --disk /dev/mapper/0x9294ba15777e8eac --loader \\EFI\\fedora\\shimx64.efi --label "Test1"
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0006,0005,0002,0003,0000,0001,0004
Boot0000* BootManagerMenuApp FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(eec25bdc-67f2-4d95-b1d5-f81b2039d11d)
Boot0001* EFI Firmware Setup FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0002* UEFI NVME VirtualMultipath PciRoot(0x0)/Pci(0x2,0x0)/SCSI(0,0){auto_created_boot_option}
Boot0003* UEFI NVME VirtualMultipath 2 PciRoot(0x0)/Pci(0x3,0x0)/SCSI(0,0){auto_created_boot_option}
Boot0004* EFI Internal Shell FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
Boot0005* Fedora HD(2,GPT,e602f792-6a66-4788-95f4-e26774bf5bdc,0x1000,0x3f800)/\EFI\fedora\shimx64.efi
Boot0006* Test1 HD(1,GPT,316aff79-2338-4b06-b8e0-422dd3045195,0x800,0x800)/\EFI\fedora\shimx64.efi
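To check which partition a boot entry points at, the partition number can be read from the `HD(<part>,GPT,...)` node in the `efibootmgr` output above. This is a minimal Rust sketch for that check only; the function name `hd_partition_number` is hypothetical and it assumes the entry text has the shape shown in this transcript.

```rust
/// Extract the partition number from an `HD(<part>,GPT,...)` node in one
/// line of `efibootmgr` output. Returns `None` if the line has no such node
/// or the first field is not a number.
/// (Hypothetical helper for illustration only.)
fn hd_partition_number(entry: &str) -> Option<u32> {
    // Skip past the "HD(" marker, then read up to the first comma.
    let start = entry.find("HD(")? + "HD(".len();
    let rest = &entry[start..];
    let end = rest.find(',')?;
    rest[..end].parse().ok()
}
```

Applied to the `Test1` entry above, this yields partition 1 (the BIOS-BOOT partition), while the `Fedora` entry yields partition 2.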
> But assuming we do need it...now specifically with multipath it does create a 1:1 correspondence between partitions and device-mapper devices it seems. I guess we can rely on that, and pick up the last number

Is it safe to use the mpath device /dev/mapper/0x9294ba15777e8eac instead of the backing device /dev/sda? In my testing, the results are the same. But first we need to check whether it is multipath.
[root@cosa-devsh core]# efibootmgr -c --disk /dev/mapper/0x9294ba15777e8eac --part 2 --loader \\EFI\\fedora\\shimx64.efi --label "Test1"
[root@cosa-devsh core]# efibootmgr -c --disk /dev/sda --part 2 --loader \\EFI\\fedora\\shimx64.efi --label "Test2"
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0007,0006,0005,0002,0003,0000,0001,0004
Boot0005* Fedora HD(2,GPT,e602f792-6a66-4788-95f4-e26774bf5bdc,0x1000,0x3f800)/\EFI\fedora\shimx64.efi
Boot0006* Test1 HD(2,GPT,e602f792-6a66-4788-95f4-e26774bf5bdc,0x1000,0x3f800)/\EFI\fedora\shimx64.efi
Boot0007* Test2 HD(2,GPT,e602f792-6a66-4788-95f4-e26774bf5bdc,0x1000,0x3f800)/\EFI\fedora\shimx64.efi
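The two commands above differ only in the `--disk` argument. As a sketch, the same invocation could be assembled from Rust with `std::process::Command`; the function name `efibootmgr_create` is hypothetical, and only the flags shown in the transcript are used. The function builds the command without running it.

```rust
use std::process::Command;

/// Build (but do not run) an `efibootmgr -c` invocation using the flags
/// from the transcript above. (Hypothetical helper for illustration.)
fn efibootmgr_create(disk: &str, part: &str, loader: &str, label: &str) -> Command {
    let mut cmd = Command::new("efibootmgr");
    cmd.args([
        "-c", "--disk", disk, "--part", part, "--loader", loader, "--label", label,
    ]);
    cmd
}
```

Because no shell is involved, the loader path is passed with single backslashes (`\EFI\fedora\shimx64.efi`); the doubled backslashes in the transcript are shell escaping.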
Check whether it is mpath:
[root@cosa-devsh core]# realpath /dev/mapper/0x9294ba15777e8eac
/dev/dm-0
[root@cosa-devsh core]# cat /sys/block/dm-0/dm/uuid
mpath-0x9294ba15777e8eac
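The check above can be sketched in Rust by reading the `dm/uuid` attribute and testing for the `mpath-` prefix seen in this output. The function name `is_mpath` and the `sysfs_block` parameter are assumptions made for illustration; the parameter would normally be `/sys/block` and exists only so the function can be pointed at a test directory.

```rust
use std::fs;
use std::path::Path;

/// Report whether the device-mapper node `kname` (e.g. "dm-0") looks like a
/// multipath map, judged by its `dm/uuid` attribute starting with "mpath-".
/// A missing or unreadable attribute is treated as "not multipath".
/// (Hypothetical helper for illustration only.)
fn is_mpath(sysfs_block: &Path, kname: &str) -> bool {
    let uuid_path = sysfs_block.join(kname).join("dm").join("uuid");
    match fs::read_to_string(uuid_path) {
        // sysfs attributes end with a newline, so trim before comparing.
        Ok(uuid) => uuid.trim().starts_with("mpath-"),
        Err(_) => false,
    }
}
```

For the device in this transcript, the call would be `is_mpath(Path::new("/sys/block"), "dm-0")`.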
}

/// Get backing devices for /dev/mapper/xxx
pub fn get_backing_devices(device: &str) -> Result<Vec<String>> {
We already have this in get_devices() right?
Yes, get_devices() calls bootc_internal_blockdev::find_parent_devices (from blockdev.rs), which only returns the mpath device, i.e. /dev/mapper/xxx. But what we want is the backing disks rather than the mpath device. Is that correct?
Here I am using something like the following to get the backing disks:
[root@cosa-devsh core]# realpath /dev/mapper/0x27ba642d851ba0f0
/dev/dm-0
# ls /sys/block/dm-0/slaves/
sda sdb
I think we can call it recursively until we find a partitioned device right?
Basically lsblk -J should already be doing all the scraping of sysfs etc. for us
I checked, and the backing devices cannot be found in the output of the following command (sorry for the output format; I'm not sure why the code block does not work):
# lsblk -J -b -O /dev/mapper/0xc7d72158e865d8bd
output
```
{ "blockdevices": [ { "alignment": 0, "id-link": "scsi-0xc7d72158e865d8bd", "id": "0xc7d72158e865d8bd", "disc-aln": 0, "dax": false, "disc-gran": 4096, "disk-seq": 3, "disc-max": 1073741824, "disc-zero": false, "fsavail": null, "fsroots": [ null ], "fssize": null, "fstype": null, "fsused": null, "fsuse%": null, "fsver": null, "group": "disk", "hctl": null, "hotplug": false, "kname": "dm-0", "label": null, "log-sec": 512, "maj:min": "252:0", "maj": "252", "min": "0", "min-io": 512, "mode": "brw-rw----", "model": null, "mq": " 1", "name": "0xc7d72158e865d8bd", "opt-io": 0, "owner": "root", "partflags": null, "partlabel": null, "partn": null, "parttype": null, "parttypename": null, "partuuid": null, "path": "/dev/mapper/0xc7d72158e865d8bd", "phy-sec": 512, "pkname": null, "pttype": "gpt", "ptuuid": "c234431c-b917-4405-96a3-32cadd9a1c60", "ra": 128, "rand": false, "rev": null, "rm": false, "ro": false, "rota": true, "rq-size": 256, "sched": "mq-deadline", "serial": null, "size": 10737418240, "start": null, "state": "running", "subsystems": "block", "mountpoint": null, "mountpoints": [ null ], "tran": null, "type": "mpath", "uuid": null, "vendor": null, "wsame": 0, "wwn": null, "zoned": "none", "zone-sz": 0, "zone-wgran": 0, "zone-app": 0, "zone-nr": 0, "zone-omax": 0, "zone-amax": 0, "children": [ { "alignment": 0, "id-link": "dm-name-0xc7d72158e865d8bd1", "id": "name-0xc7d72158e865d8bd1", "disc-aln": 0, "dax": false, "disc-gran": 4096, "disk-seq": 4, "disc-max": 1073741824, "disc-zero": false, "fsavail": null, "fsroots": [ null ], "fssize": null, "fstype": null, "fsused": null, "fsuse%": null, "fsver": null, "group": "disk", "hctl": null, "hotplug": false, "kname": "dm-1", "label": null, "log-sec": 512, "maj:min": "252:1", "maj": "252", "min": "1", "min-io": 512, "mode": "brw-rw----", "model": null, "mq": "1", "name": "0xc7d72158e865d8bd1", "opt-io": 0, "owner": "root", "partflags": null, "partlabel": "BIOS-BOOT", "partn": 1, "parttype": 
"21686148-6449-6e6f-744e-656564454649", "parttypename": null, "partuuid": "316aff79-2338-4b06-b8e0-422dd3045195", "path": "/dev/mapper/0xc7d72158e865d8bd1", "phy-sec": 512, "pkname": "dm-0", "pttype": null, "ptuuid": null, "ra": 128, "rand": false, "rev": null, "rm": false, "ro": false, "rota": true, "rq-size": null, "sched": null, "serial": null, "size": 1048576, "start": null, "state": "running", "subsystems": "block", "mountpoint": null, "mountpoints": [ null ], "tran": null, "type": "part", "uuid": null, "vendor": null, "wsame": 0, "wwn": null, "zoned": "none", "zone-sz": 0, "zone-wgran": 0, "zone-app": 0, "zone-nr": 0, "zone-omax": 0, "zone-amax": 0 },{ "alignment": 0, "id-link": "dm-name-0xc7d72158e865d8bd2", "id": "name-0xc7d72158e865d8bd2", "disc-aln": 0, "dax": false, "disc-gran": 4096, "disk-seq": 5, "disc-max": 1073741824, "disc-zero": false, "fsavail": null, "fsroots": [ null ], "fssize": null, "fstype": "vfat", "fsused": null, "fsuse%": null, "fsver": "FAT16", "group": "disk", "hctl": null, "hotplug": false, "kname": "dm-2", "label": "EFI-SYSTEM", "log-sec": 512, "maj:min": "252:2", "maj": "252", "min": "2", "min-io": 512, "mode": "brw-rw----", "model": null, "mq": "1", "name": "0xc7d72158e865d8bd2", "opt-io": 0, "owner": "root", "partflags": null, "partlabel": "EFI-SYSTEM", "partn": 2, "parttype": "c12a7328-f81f-11d2-ba4b-00a0c93ec93b", "parttypename": null, "partuuid": "e602f792-6a66-4788-95f4-e26774bf5bdc", "path": "/dev/mapper/0xc7d72158e865d8bd2", "phy-sec": 512, "pkname": "dm-0", "pttype": null, "ptuuid": null, "ra": 128, "rand": false, "rev": null, "rm": false, "ro": false, "rota": true, "rq-size": null, "sched": null, "serial": null, "size": 133169152, "start": null, "state": "running", "subsystems": "block", "mountpoint": null, "mountpoints": [ null ], "tran": null, "type": "part", "uuid": "7B77-95E7", "vendor": null, "wsame": 0, "wwn": null, "zoned": "none", "zone-sz": 0, "zone-wgran": 0, "zone-app": 0, "zone-nr": 0, "zone-omax": 0, 
"zone-amax": 0 },{ "alignment": 0, "id-link": "dm-name-0xc7d72158e865d8bd3", "id": "name-0xc7d72158e865d8bd3", "disc-aln": 0, "dax": false, "disc-gran": 4096, "disk-seq": 6, "disc-max": 1073741824, "disc-zero": false, "fsavail": 193817600, "fsroots": [ "/" ], "fssize": 366869504, "fstype": "ext4", "fsused": 148725760, "fsuse%": "41%", "fsver": "1.0", "group": "disk", "hctl": null, "hotplug": false, "kname": "dm-3", "label": "boot", "log-sec": 512, "maj:min": "252:3", "maj": "252", "min": "3", "min-io": 512, "mode": "brw-rw----", "model": null, "mq": "1", "name": "0xc7d72158e865d8bd3", "opt-io": 0, "owner": "root", "partflags": null, "partlabel": "boot", "partn": 3, "parttype": "0fc63daf-8483-4772-8e79-3d69d8477de4", "parttypename": null, "partuuid": "71de3c16-8422-4bfe-b76f-35f8a3d9e170", "path": "/dev/mapper/0xc7d72158e865d8bd3", "phy-sec": 512, "pkname": "dm-0", "pttype": null, "ptuuid": null, "ra": 128, "rand": false, "rev": null, "rm": false, "ro": false, "rota": true, "rq-size": null, "sched": null, "serial": null, "size": 402653184, "start": null, "state": "running", "subsystems": "block", "mountpoint": "/boot", "mountpoints": [ "/boot" ], "tran": null, "type": "part", "uuid": "3e1e6461-1a34-45fd-8374-3cc679fc4950", "vendor": null, "wsame": 0, "wwn": null, "zoned": "none", "zone-sz": 0, "zone-wgran": 0, "zone-app": 0, "zone-nr": 0, "zone-omax": 0, "zone-amax": 0 },{ "alignment": 0, "id-link": "dm-name-0xc7d72158e865d8bd4", "id": "name-0xc7d72158e865d8bd4", "disc-aln": 0, "dax": false, "disc-gran": 4096, "disk-seq": 7, "disc-max": 1073741824, "disc-zero": false, "fsavail": 7882080256, "fsroots": [ "/ostree/deploy/fedora-coreos/var", "/ostree/deploy/fedora-coreos/var", "/", "/ostree/deploy/fedora-coreos/deploy/ff4042dc0cc8157c81ff044bfd52bafb4dac54f39641e399ad1ca1fee7adde5c.0/etc" ], "fssize": 10131341312, "fstype": "xfs", "fsused": 2249261056, "fsuse%": "22%", "fsver": null, "group": "disk", "hctl": null, "hotplug": false, "kname": "dm-4", "label": "root", 
"log-sec": 512, "maj:min": "252:4", "maj": "252", "min": "4", "min-io": 512, "mode": "brw-rw----", "model": null, "mq": "1", "name": "0xc7d72158e865d8bd4", "opt-io": 0, "owner": "root", "partflags": null, "partlabel": "root", "partn": 4, "parttype": "0fc63daf-8483-4772-8e79-3d69d8477de4", "parttypename": null, "partuuid": "93646aa2-a786-444f-8c04-cd1bcd93f25f", "path": "/dev/mapper/0xc7d72158e865d8bd4", "phy-sec": 512, "pkname": "dm-0", "pttype": null, "ptuuid": null, "ra": 128, "rand": false, "rev": null, "rm": false, "ro": false, "rota": true, "rq-size": null, "sched": null, "serial": null, "size": 10198450688, "start": null, "state": "running", "subsystems": "block", "mountpoint": "/sysroot", "mountpoints": [ "/var", "/sysroot/ostree/deploy/fedora-coreos/var", "/sysroot", "/etc" ], "tran": null, "type": "part", "uuid": "630f6b23-fd3b-4500-ade6-c6beb5bb0d1e", "vendor": null, "wsame": 0, "wwn": null, "zoned": "none", "zone-sz": 0, "zone-wgran": 0, "zone-app": 0, "zone-nr": 0, "zone-omax": 0, "zone-amax": 0 } ] } ] }
```
Or should we add another function similar to find_parent_devices(device: &str), for example find_parent_disks(), that skips the mpath?
See output:
# lsblk --pairs --paths --inverse --output NAME,TYPE /dev/mapper/0xc7d72158e865d8bd
NAME="/dev/mapper/0xc7d72158e865d8bd" TYPE="mpath"
NAME="/dev/sda" TYPE="disk"
NAME="/dev/sdb" TYPE="disk"
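As a sketch of how a helper like the proposed find_parent_disks() might filter this output, the following Rust code parses `lsblk --pairs` lines and keeps only entries with `TYPE="disk"`. The names `parse_pairs`, `field`, and `backing_disks` are hypothetical, and the parser assumes values contain no literal double quotes.

```rust
/// Split one `KEY="value"` line of `lsblk --pairs` output into pairs.
/// Assumes values contain no literal double quotes.
fn parse_pairs(line: &str) -> Vec<(&str, &str)> {
    let mut out = Vec::new();
    let mut rest = line.trim();
    while let Some((key, after)) = rest.split_once("=\"") {
        let Some((value, tail)) = after.split_once('"') else { break };
        out.push((key.trim(), value));
        rest = tail;
    }
    out
}

/// Look up one key in a parsed line.
fn field<'a>(pairs: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    pairs.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

/// Return the NAME of every entry whose TYPE is "disk", skipping "mpath".
/// (Hypothetical helper for illustration only.)
fn backing_disks(lsblk_output: &str) -> Vec<String> {
    lsblk_output
        .lines()
        .filter_map(|line| {
            let pairs = parse_pairs(line);
            match (field(&pairs, "NAME"), field(&pairs, "TYPE")) {
                (Some(name), Some("disk")) => Some(name.to_string()),
                _ => None,
            }
        })
        .collect()
}
```

Given the three lines of output above, `backing_disks` would keep `/dev/sda` and `/dev/sdb` and drop the `mpath` entry.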
Trying it out myself, it does indeed look like I was wrong; apparently the devmapper backing data doesn't end up getting scraped by lsblk -J in this case. I guess scraping the /sys/block/<dm>/slaves directory is indeed the best.
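A minimal Rust sketch of scraping the `slaves` directory follows. The function name `backing_devices` and the `sysfs_block` parameter are assumptions for illustration (the PR's own function is `get_backing_devices`, whose body is not shown here); `sysfs_block` would normally be `/sys/block`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List the devices backing a device-mapper node by reading the entries of
/// `<sysfs_block>/<kname>/slaves`. Names are returned sorted.
/// (Hypothetical helper for illustration only.)
fn backing_devices(sysfs_block: &Path, kname: &str) -> io::Result<Vec<String>> {
    let slaves = sysfs_block.join(kname).join("slaves");
    let mut names = Vec::new();
    for entry in fs::read_dir(slaves)? {
        let entry = entry?;
        // Each entry is named after a backing device, e.g. "sda".
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```

For the device in the earlier transcript, `backing_devices(Path::new("/sys/block"), "dm-0")` would be expected to list `sda` and `sdb`.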
Updated:
Check whether it is mpath, then get the ESP partition number from /dev/mapper/0x84deb334f7392fde2 by stripping the parent device prefix /dev/mapper/0x84deb334f7392fde and then any non-digit characters. This is needed because the path is sometimes /dev/mapper/3600a0980383030397a3f4f4344416334p1, and what we want is 1, not p1.
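The prefix-stripping approach described here can be sketched in Rust as follows. The function name `mpath_partition_number` is hypothetical, and this is one reading of the description rather than the PR's actual code.

```rust
/// Derive the partition number from a multipath partition path by removing
/// the parent map's path and then any leading non-digit separator ("p").
/// Returns `None` if `partition` does not start with `parent` or if no
/// purely numeric suffix remains.
/// (Hypothetical helper for illustration only.)
fn mpath_partition_number(parent: &str, partition: &str) -> Option<String> {
    let suffix = partition.strip_prefix(parent)?;
    let digits = suffix.trim_start_matches(|c: char| !c.is_ascii_digit());
    if !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit()) {
        Some(digits.to_string())
    } else {
        None
    }
}
```

Stripping the parent path first means digits that belong to the WWID itself (for example the trailing `334` in `3600a0980383030397a3f4f4344416334`) are never mistaken for part of the partition number.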
joelcapitao left a comment
I'm just wondering about LUKS support in get_esp_partition_number but other than that it LGTM
}

/// Get esp partition number from device
pub fn get_esp_partition_number(device: &str) -> Result<String> {
AIUI, we only support multipath (both friendly and non-friendly names, i.e. WWID) as device-mapper, so no LVM or encrypted devices with LUKS?
Not supported yet. Currently it fails with "Not supported for <device>" if the device is neither a normal disk nor mpath.
I will merge this if there is no objection, and will perhaps ask the reporter to do some testing based on the copr build.
As there is no partition attribute in a multipath environment, this leads to an error when installing on multipath. Fixes: https://issues.redhat.com/browse/RHEL-81039
When the sysfs `partition` attribute is missing on multipath devices and lsblk's `partn` field is also unavailable, fall back to extracting trailing digits from the ESP device path (e.g. `/dev/mapper/mpatha2` → `"2"`). This restores the behavior originally added in bootupd PR [bootc-dev#1006](coreos/bootupd#1006) that was lost during the migration to the bootc-internal-blockdev crate.

Assisted-by: Claude Code
Signed-off-by: ckyrouac <[email protected]>
efi: Check shim exists and return earlier if not found
efi: Get the correct esp on multipath
Get pointer from Colin:
When testing with either the mpath device or a backing device, the
results are the same, so I use the mpath device instead.
As there is no partition attribute in a multipath environment, this leads
to an error when installing on multipath.
Fixes: https://issues.redhat.com/browse/RHEL-81039