Skip to content

Conversation

@ckyrouac
Copy link
Collaborator

This was a lot of back and forth with claude. Pushing this for some feedback.


Prep for supporting installing to devices with multiple parents (e.g. RAID, LVM), see #1911. The core change is refactoring bootc-blockdev to use lsblk instead of sfdisk to resolve the device tree. See the individual commits for more details.

Add two new fields to the Device struct that are already provided by
lsblk's JSON output:

- `partn`: Partition number (1-indexed), None for whole disk devices
- `pttype`: Partition table type (e.g., "gpt", "dos")

These fields will be used in subsequent commits to replace the
sfdisk-based PartitionTable API with a unified lsblk-based Device API.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Add new methods to the Device struct that provide the same functionality
as the PartitionTable API but work directly on lsblk's Device structure:

- node(): Alias for path() for compatibility
- find_device_by_partno(partno): Find a child partition by number
- find_partition_of_type(uuid): Find a child partition by type GUID
- find_partition_of_esp(): Find the EFI System Partition

These methods will replace the corresponding PartitionTable methods
in subsequent commits.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Add functionality to look up parent devices and list devices by directory:

- Add `bootc-mount` and `cap-std-ext` dependencies for filesystem inspection
- Add `parents: Option<Vec<Device>>` field (with `#[serde(skip)]`) to Device
  to store parent device hierarchy from lsblk --inverse
- Add `list_parents()` private function to query parent devices
- Modify `list_dev()` to populate the parents field
- Add `list_dev_by_dir()` public function to find the device containing
  a filesystem mounted at a given directory

This provides a replacement for the `find_parent_devices()` function
and enables looking up block devices from mounted directories.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Add /run/udev to the list of host mounts that are mirrored during
install-to-disk. This is required for lsblk to properly query device
information when running inside the installation container.

Without access to udev state, lsblk may fail to return complete device
information needed for partition detection and device lookup.

See: bootc-dev#688

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Migrate all consumers of the sfdisk-based PartitionTable API to use the
new lsblk-based Device API:

- Change RootSetup.device_info type from PartitionTable to Device
- Update partition type checking to use pttype.as_deref() instead of
  the PartitionType enum
- Replace partitions_of() calls with list_dev()
- Replace find_partno() with find_device_by_partno()
- Replace .node access with .path()
- Remove esp_in() function from bootloader.rs
- Remove get_esp_partition_node() from bootloader.rs
- Remove get_sysroot_parent_dev() and get_esp_partition() from boot.rs
- Update ESP finding to use Device.find_partition_of_esp()
- Update parent device lookup to use list_dev_by_dir()

The Device API provides a unified way to query block device information
using lsblk, which is more reliable than sfdisk for partition discovery
and works consistently across different device types.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
The parent device traversal in install.rs exits early when the root
filesystem is on LVM on a partition (e.g., /dev/BL/root02 → /dev/loop0p4
→ /dev/loop0).

The issue is that list_parents() uses lsblk --inverse which returns a
nested structure where parent devices have their own parents in their
"children" field (inverse terminology). We were only extracting the
immediate parents without converting their children to parents.

Fix by adding convert_inverse_children_to_parents() which recursively
walks the inverse device tree and converts each device's children field
to a parents field, enabling full traversal up the device hierarchy.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Rewrite list_dev() to build a complete device tree where every node
has both parents and children populated. This is done with two lsblk
calls:

1. Inverse lsblk on target device to find the root (top of hierarchy)
2. Regular lsblk on root to get the full tree with all children

Then we populate parent links by walking the tree from root down,
and return the target device from the fully-populated tree.

This fixes the install-to-filesystem test where the root device
(e.g., /dev/loop0) needs its children (partitions) populated for
bwrap to bind them during bootloader installation.

Assisted-by: Claude Code (Opus 4.5)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Remove eager parent population from list_dev(), which previously made
two lsblk calls (inverse + regular) and built a complete tree with
parent links. The parents field stored stripped clones (no
children/parents), so traversing grandparents from any intermediate
node would yield None.

Instead, simplify list_dev() to a single `lsblk -J -b -O <dev>` call
plus backfill_missing(). Parents are now resolved on demand via
Device::list_parents(), which calls `lsblk --inverse` and returns the
parent chain through nested children fields. Device::list_children()
handles the case where a Device obtained from an inverse tree lacks
children by querying lsblk on demand.

This removes the helper functions populate_parents(), find_root_path(),
find_root_device(), and find_device_in_tree(). The parents field is
retained for now but no longer populated by list_dev().

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Replace the loop over dev.parents.take() with the new
Device::list_parents() API. The inverse lsblk tree is walked to the
root, with multi-parent detection preserved.

Once the root device is found, refresh it via lsblk to populate its
actual children (partitions). Devices from the inverse tree only carry
parent links, not real children, so without this the bwrap container
would lack partition device bind mounts.

With no remaining consumers of the parents field, remove it from the
Device struct.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
The PARTN column was added in util-linux 2.39, but CentOS 9 / RHEL 9
ship util-linux 2.37. When lsblk -O is used on these systems, the
partn field is absent from JSON output, causing find_device_by_partno()
to fail with "Missing partition for index N".

Add backfill_partn() alongside the existing backfill_start() to read
the partition number from /sys/dev/block/{maj:min}/partition when lsblk
doesn't provide it.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Add a root_disk() method that walks the parent chain via lsblk
--inverse to find the root (whole disk) device, then refreshes it
to populate its actual children (partitions).

This is needed by callers that start from a partition device (e.g.
from list_dev_by_dir on /sysroot) and need to find sibling
partitions such as the ESP.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
@github-actions github-actions bot added the area/install Issues related to `bootc install` label Feb 11, 2026
@bootc-bot bootc-bot bot requested a review from henrywang February 11, 2026 16:09
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the bootc-blockdev crate to use lsblk for device tree resolution instead of sfdisk. This is a significant improvement that prepares the codebase for supporting complex storage layouts like RAID and LVM. The changes include a more robust Device structure with lazy child loading and parent chain traversal. I have provided feedback on the root_disk logic which currently remains restrictive regarding multiple parents, and suggested a minor improvement for sysfs property parsing.

Comment on lines +1601 to +1603
// udev state is required for running lsblk during install to-disk
// see https://github.com/bootc-dev/bootc/pull/688
bootc_mount::ensure_mirrored_host_mount("/run/udev")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Mounting /run/udev is a critical addition for the reliability of lsblk when running inside a container. This ensures that the container has access to the host's udev database, which is required for lsblk to correctly identify device properties and relationships.

Comment on lines +199 to +203
anyhow::ensure!(
current.len() == 1,
"Device {} has multiple parents; cannot determine root disk",
self.path()
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of root_disk explicitly bails if a device has multiple parents. While the PR description mentions this is preparation for RAID/LVM support, this specific check will prevent RAID1 or multipath devices from being resolved to a root disk. If the intent is to support these in the future, consider relaxing this or providing a more descriptive error message indicating that RAID/multipath is not yet supported for root disk determination.

All callers that need the whole-disk device (e.g. to find the ESP
partition) now use root_disk() to traverse from the filesystem
partition to the root disk.

This fixes list_dev_by_dir() consumers (store, bootloader, composefs
boot) that were searching for ESP partitions on a partition device
rather than the whole disk, and simplifies install.rs which had an
open-coded parent walk.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Copy link
Collaborator

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this has the potential for a big cleanup. I have some nits but also some larger structural questions around the end state.

.ok_or_else(|| anyhow!("Missing partition for index {partno}"))
}

/// Find a child partition by partition type GUID.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for MBR it's a partition type. I'd just s/ GUID// - or to be more accurate we could explain here that GPT is GUID, MBR is partition type code.

first_child.partuuid.as_deref().unwrap(),
"3979e399-262f-4666-aabc-7ab5d3add2f0"
);
// Verify find_device_by_partno works
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sane, but do we have a test JSON that's MBR? Would be good to add here if not

Comment on lines 182 to 185
let target = output.blockdevices.first_mut();

match target {
Some(target) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor style but how about:

let r = match output.blockdevices.as_slice() {
  [target, ...] => { ...; parents },
  [] => Vec::new();
}
Ok(r)

// from: https://github.com/systemd/systemd/blob/26b2085d54ebbfca8637362eafcb4a8e3faf832f/man/systemd-boot.xml#L392
const SYSTEMD_KEY_DIR: &str = "loader/keys";

#[allow(dead_code)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice to see this duplicated code go away

Comment on lines +125 to +127
// Collect partition paths first so they live long enough
let partition_paths: Vec<String> =
device.children.iter().flatten().map(|p| p.path()).collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have a Vec<&str> i think if we just moved this call up closer to the function entrypoint

}

/// Find a device in the tree by its path.
fn find_device_in_tree(root: &Device, target_path: &Utf8Path) -> Option<Device> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one looks trivial to generate some unit tests for too

Comment on lines +123 to +124
// The "partn" column was added in util-linux 2.39, which is newer than
// what CentOS 9 / RHEL 9 ship (2.37).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh yes I recall getting burned by a variant of this too 😢

util-linux basically doesn't rebase by default in RHEL so at this point in RHEL9 it's quite old.

But they will patch things if asked, so we could ask...

/// Returns the root device with its children (partitions) populated.
/// If this device is already a root device, returns a clone of `self`.
/// Fails if the device has multiple parents at any level.
pub fn root_disk(&self) -> Result<Device> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this function is unused in this commit, which seems surprising?

Probably could also have unit tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to unit tests

yea I need to clean up the commits some more. I tried to have claude help but it was a little aggressive on how many commits it created. For reference this function is used in future commits, e.g.

https://github.com/bootc-dev/bootc/pull/2000/changes#diff-1604878878a88c4dcbd7f51444b69fc01a58e3ab48ade4e10b7a9ed05fa867f8R536

https://github.com/bootc-dev/bootc/pull/2000/changes#diff-1604878878a88c4dcbd7f51444b69fc01a58e3ab48ade4e10b7a9ed05fa867f8R1068


// Locate ESP partition device
let root_dev = bootc_blockdev::list_dev_by_dir(&storage.physical_root)?;
let root_dev = bootc_blockdev::list_dev_by_dir(&storage.physical_root)?.root_disk()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Unrelated to this line of code but I'd argue to squash this with previous)

Comment on lines 48 to 49
let dev = bootc_blockdev::list_dev_by_dir(physical_root)?.root_disk()?;
if let Some(esp_dev) = dev.find_partition_of_type(bootc_blockdev::ESP) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the end state, don't we want to handle multiple devices? it seems surprising after this series that we end up with calling a fallible .root_disk() in situations like this.

I think this particular case really wants to use a shared find_esp_from_filesystem() or so right? (Also I note here this is using bootc_blockdev::ESP but we need to handle MBR too so we really do need to have a find_esp function which handles either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

root_disk() is sort of a stop-gap function. My current plan for multi device support is to gradually add support depending on the bootloader. So the followup commit to this will add multi device support for uefi and bios installations, but systemd-boot and zipl will continue to only support a single root disk.

@cgwalters
Copy link
Collaborator

Backing up at a high level what I think would make sense for a series is:

  1. Pure bugfixes/cleanups like the "add missing ptnum for rhel9"
  2. Consolidating a shared "find_esp_from_filesystem" and "find_esp_in_device" functions
  3. Rewriting blockdev.rs (adding new APIs where needed), port things to use the new API?

That said it's possible that the 2. and 3. are too entangled.

@ckyrouac
Copy link
Collaborator Author

That said it's possible that the 2. and 3. are too entangled.

They definitely felt entangled as I was iterating on this but I should be able to mostly isolate them.

@ckyrouac ckyrouac mentioned this pull request Feb 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/install Issues related to `bootc install`

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants