Add vhost-user support with RNG device implementation #527
dorindabassey wants to merge 5 commits into containers:main
6600bb5 to
f9fe527
Compare
mtjhrc
left a comment
I had a look at the tests, and honestly I don't see much value in them. We don't have any arbitrary coverage percentage metric for merging, so I would just remove most of them.
The rest of the code looks good, but I've only had a quick look so far, back when this was still a draft. I'll try running this and have another look later.
Having vhost-user support in libkrun seems pretty cool, thanks!
f9fe527 to
7d18fa0
Compare
Not really a review, just a comment: @dorindabassey This is some great work and a huge addition to the project. @slp passt uses vhost-user to provide near-native network performance in user mode for QEMU. Perhaps we should consider doing the same here, especially as the main network driver for the v2 API.
Right, that would be great to have! It's mostly about throughput and latency, but also, perhaps, one day, to implement live migration of TCP connections, as it's only available via the vhost-user interface.

I'm not sure how much effort networking support would take on top of this pull request, and whether it's beyond its scope or not, but let me share a couple of pointers just in case.

The part of the API that's implemented by passt is this: https://passt.top/passt/tree/vhost_user.c?id=af7b81b5408da8c56bb22dd11679f2b4024a45c8#n1128

And there's a bit of documentation about how it can be used with QEMU here: https://www.qemu.org/docs/master/system/devices/net.html#using-passt-as-the-user-mode-network-stack

Let me know if you have any questions! I'll also tag @vivier, the author of the vhost-user implementation in passt, for good measure.
It's fine for this PR to stand as-is. This can be used as a base for another series adding the network implementation.
Adding support for virtio-net over vhost doesn't require any changes to any of our networking code. Our All the user would have to do to enable it is:
9cfb1ea to
fe4e9ba
Compare
b3bfb2e to
3feb780
Compare
e5c413f to
5b0b670
Compare
5b0b670 to
731cc43
Compare
731cc43 to
ab4d9e5
Compare
/gemini review
Code Review
This pull request introduces significant new functionality by adding vhost-user frontend support, with an initial implementation for the RNG device. The changes are well-structured across the C FFI, device layer, and VMM builder. A key aspect is the correct implementation of file-backed guest memory using memfd, which is crucial for vhost-user. My review includes suggestions to improve robustness, clarify the API, and clean up the example code.
        .iter()
        .any(|dev| dev.device_type == VIRTIO_ID_RNG);

    if !vm_resources.disable_implicit_rng && !has_vhost_user_rng {
The idea I mentioned with krun_disable_implicit_rng was that it would replace this automatic disabling of the internal device. I'm not saying that automatically disabling the built-in device is necessarily bad, though. In fact, it will often be more user friendly.
However, having the user explicitly call krun_disable_implicit_rng has one big advantage: it will definitely stay consistent with future devices. Imagine that in the future we have vhost alternatives for basically every device. Adding more auto-detection rules here would not be backwards compatible, potentially causing great confusion about what is in fact enabled.
To be clear, I am not saying this necessarily has to be changed. It's just an observation.
Yeah, good point about future compatibility. For now I think the auto-detection provides a better user experience, but I'll keep this in mind if we add vhost-user support for other device types in the future.
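For comparison, the two policies discussed in this thread can be sketched as plain functions. This is only an illustration built from the snippet under review: `VhostUserDeviceConfig` is reduced here to a single `device_type` field, and the function names `builtin_rng_auto` and `builtin_rng_explicit` are hypothetical, not part of libkrun.

```rust
/// Virtio device ID for the entropy device, as used in the snippet under review.
const VIRTIO_ID_RNG: u32 = 4;

/// Hypothetical, trimmed-down stand-in for the PR's device configuration.
/// Only the field needed for this sketch is kept.
struct VhostUserDeviceConfig {
    device_type: u32,
}

/// Auto-detection policy (what the PR does): add the built-in RNG unless the
/// user disabled it OR a vhost-user RNG device has been configured.
fn builtin_rng_auto(disable_implicit_rng: bool, devices: &[VhostUserDeviceConfig]) -> bool {
    let has_vhost_user_rng = devices
        .iter()
        .any(|dev| dev.device_type == VIRTIO_ID_RNG);
    !disable_implicit_rng && !has_vhost_user_rng
}

/// Explicit-only policy (the reviewer's observation): the built-in RNG is
/// skipped only when the user asked for that, whatever else is attached.
fn builtin_rng_explicit(disable_implicit_rng: bool) -> bool {
    !disable_implicit_rng
}
```

With a vhost-user RNG attached and no explicit opt-out, the first policy drops the built-in device while the second keeps it, which is the behavioural difference the thread is weighing.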
ab4d9e5 to
d21847a
Compare
Implement vhost-user support for connecting to external virtio device backends running in separate processes. Add vhost-user feature flag, vhost dependency, and krun_add_vhost_user_device() generalized API for adding vhost-user devices. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Add memfd-backed memory region creation to enable memory sharing with vhost-user backends via FD passing. When vhost-user is enabled, all guest RAM regions are created with memfd backing instead of anonymous mmap. This lays the groundwork for vhost-user device support while maintaining backward compatibility such that the VM boots normally with standard memory when vhost-user is not configured. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Implement a generic vhost-user device wrapper with connection, feature negotiation, and guest physical address (GPA) to virtual address (VA) translation. Supports protocol feature negotiation (CONFIG, MQ) and forwards backend interrupts to the guest. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Implement a vhost-user RNG device. The VMM now switches between the standard RNG device and the vhost-user RNG depending on whether a socket path is configured via krun_add_vhost_user_device(). This allows us to use the RNG device from the rust-vmm vhost-device running in a separate process for better isolation and flexibility. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Adds --vhost-user-rng command line option to specify a vhost-user RNG backend socket path. When provided, the VM uses the external vhost-user RNG device instead of the built-in virtio-rng implementation.
Example usage:
    ./examples/chroot_vm \
        --vhost-user-rng=/tmp/vhost-rng.sock0 \
        / /bin/sh -c "head -c 32 /dev/hwrng | xxd"
Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
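The GPA to VA translation mentioned in the commit messages above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Region` struct, its field names, and `gpa_to_va` are assumptions made for the example, and addresses are modelled as plain `u64` values rather than real mappings.

```rust
/// One guest RAM region as seen by the VMM (hypothetical layout for this sketch).
#[derive(Debug, Clone, Copy)]
struct Region {
    /// First guest physical address covered by the region.
    guest_base: u64,
    /// Length of the region in bytes.
    size: u64,
    /// Virtual address in the VMM process where the region is mapped.
    host_base: u64,
}

/// Translate a guest physical address into a VMM virtual address.
/// Returns `None` when the address is not covered by any region.
fn gpa_to_va(regions: &[Region], gpa: u64) -> Option<u64> {
    regions.iter().find_map(|r| {
        // An address below the region base cannot belong to this region.
        let offset = gpa.checked_sub(r.guest_base)?;
        if offset < r.size {
            r.host_base.checked_add(offset)
        } else {
            None
        }
    })
}
```

A linear scan is enough for a handful of RAM regions; a sorted table with binary search would be the usual choice if the region count grew.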
d21847a to
aef9b7a
Compare
I left some further review comments.
After some simple fixes, this actually works great for virtio-net over vhost using passt.
Here are the minimum changes necessary to fix this for passt, plus an example:
dorindabassey/libkrun@vhost-user-support...mtjhrc:libkrun:vhost-user-net
The fixes are included in the changes requested to this PR. I'll create a separate PR for the example (and maybe an integration test too).
    const VIRTIO_ID_RNG: u32 = 4;
    const VIRTIO_ID_SND: u32 = 25;
    const VIRTIO_ID_CAN: u32 = 36;

    if device_type != VIRTIO_ID_RNG && device_type != VIRTIO_ID_SND && device_type != VIRTIO_ID_CAN
    {
        return -libc::EINVAL;
    }
I think we should remove this check.
As I mentioned in another comment, I don't think there is a reason to limit the device types a user can attach. The implementation is already generic enough for many device types, even future devices that don't exist in the spec yet. It is the guest's job to understand the virtio device(s): if the guest OS doesn't understand the device type, it can just ignore the device, or crash if there is some mismatch. From the libkrun side we don't really care.
    frontend
        .set_features(backend_features)
        .map_err(|e| io::Error::new(ErrorKind::Other, e))?;
Please remove this call; it directly breaks passt and is incorrect in general.
Configuring the features here is redundant, since they are also configured in activate(). Beyond being redundant, the value of backend_features is incorrect at this point: if you trace the logic, backend_features == VHOST_USER_F_PROTOCOL_FEATURES here, because the virtio features have been stripped.
    // Separate backend-only features from virtio features
    let backend_features = avail_features & VHOST_USER_F_PROTOCOL_FEATURES;
    let our_avail_features = avail_features & !VHOST_USER_F_PROTOCOL_FEATURES;

    // Determine actual queue count - may require protocol feature negotiation
    if backend_features & VHOST_USER_F_PROTOCOL_FEATURES != 0 {
VHOST_USER_F_PROTOCOL_FEATURES is not a mask but a single bit. To illustrate, you are actually doing:
    if avail_features & VHOST_USER_F_PROTOCOL_FEATURES & VHOST_USER_F_PROTOCOL_FEATURES != 0 {
        ...
Personally, I would just alias the variable or make avail_features mutable and do something like this (feel free to write it a bit differently, just remove the confusing and redundant bit operation):
    - // Separate backend-only features from virtio features
    - let backend_features = avail_features & VHOST_USER_F_PROTOCOL_FEATURES;
    - let our_avail_features = avail_features & !VHOST_USER_F_PROTOCOL_FEATURES;
    - // Determine actual queue count - may require protocol feature negotiation
    - if backend_features & VHOST_USER_F_PROTOCOL_FEATURES != 0 {
    + // Strip the vhost specific bit to leave only standard virtio features
    + let has_backend_features = avail_features & VHOST_USER_F_PROTOCOL_FEATURES != 0;
    + let avail_features = avail_features & !VHOST_USER_F_PROTOCOL_FEATURES;
    + if has_backend_features {
Then also change the field name of the struct into has_backend_features: bool (you are only ever storing the single bit there anyway).
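As a self-contained illustration of the suggestion in this comment, the split can be written as one small function. The sketch assumes the constant is a `u64` with the value `1 << 30`, which is the value a later comment in this review identifies with VHOST_USER_F_PROTOCOL_FEATURES; the function name `split_features` is hypothetical.

```rust
/// Single feature bit announcing vhost-user protocol features.
/// The value `1 << 30` is assumed from the hardcoded constant discussed in this review.
const VHOST_USER_F_PROTOCOL_FEATURES: u64 = 1 << 30;

/// Split the feature word reported by the backend into:
/// - a flag saying whether the backend offers protocol features, and
/// - the remaining standard virtio feature bits.
fn split_features(avail_features: u64) -> (bool, u64) {
    let has_backend_features = avail_features & VHOST_USER_F_PROTOCOL_FEATURES != 0;
    // Strip the vhost-specific bit so only standard virtio features remain.
    let virtio_features = avail_features & !VHOST_USER_F_PROTOCOL_FEATURES;
    (has_backend_features, virtio_features)
}
```

Because the constant is one bit, `avail_features & VHOST_USER_F_PROTOCOL_FEATURES` can only ever be `0` or the bit itself, which is why a `bool` carries the same information as the original `backend_features` value and why that value contains none of the virtio features.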
    let backend_features = avail_features & VHOST_USER_F_PROTOCOL_FEATURES;
    let our_avail_features = avail_features & !VHOST_USER_F_PROTOCOL_FEATURES;

    // Determine actual queue count - may require protocol feature negotiation
This comment seems misplaced; maybe it was meant to be above line 140:
`let actual_num_queues = if num_queues == 0 {`
    // NOTE: Do NOT use EFD_NONBLOCK here - the monitoring thread needs to block
    let vring_call_event = EventFd::new(0)?; // Blocking eventfd

    let has_protocol_features = backend_feature_bits & (1 << 30) != 0;
The hardcoded magic constant 1 << 30 here is in fact VHOST_USER_F_PROTOCOL_FEATURES.
    // Connect to the vhost-user backend
    let stream = UnixStream::connect(socket_path)?;
    let mut frontend = Frontend::from_stream(stream, 1);
You have hardcoded a single queue here. If you attempt to configure a device with more than one queue (and without the MQ feature!), the configuration methods will fail.
You can simply do this:
    - let mut frontend = Frontend::from_stream(stream, 1);
    + // NOTE: `num_queues` could be 0 here, but this is actually fine
    + // because if `VhostUserProtocolFeatures::MQ` is supported the negotiated
    + // value will be used automatically by Frontend
    + let mut frontend = Frontend::from_stream(stream, num_queues as u64);
Crazy stuff, thanks to both of you! @vivier have you seen this? We should iperf3 it soon. I'll try to find some time to play with this within a couple of weeks / months (surely too late for any concrete review / practical support here though).
Completed brainstorming session. Design includes:
- VhostUserFs device wrapping generic VhostUserDevice (PR containers#527)
- DAX window via FUSE_SETUPMAPPING/REMOVEMAPPING (no custom vhost-user msgs)
- Per-file DAX via FUSE_ATTR_DAX
- Custom DEVICE_STATE protocol for snapshot/restore of daemon state
- Minimal test daemon with synthetic in-memory filesystem
- 8 implementation phases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…of C API and RNG

This commit integrates PR containers#527 infrastructure by cherry-picking three commits:
- 1900376: Feature flags, Cargo.toml changes, C API, module setup
- 635abe7: Memfd-backed guest memory in builder.rs
- e5510dc: Core VhostUserDevice implementation

Included:
- VhostUserDevice generic wrapper (~450 lines) for vhost-user socket connection, feature negotiation, memory sharing, vring setup, and interrupt forwarding
- Memfd-backed guest RAM enabling fd-passing to vhost-user daemons
- vhost-user feature flag in devices, vmm, and libkrun crates
- VhostUserDeviceConfig struct for generic device configuration

Stripped (not needed for Phase 1):
- krun_add_vhost_user_device() C API function
- krun_disable_implicit_rng() C API function
- disable_implicit_rng field from VmResources
- All header file additions
- RNG device specialization logic

The vhost-user infrastructure is now ready for the FS device specialization in subsequent phases.
This PR adds vhost-user frontend support to libkrun, enabling virtio devices to run in separate processes using rust-vmm's vhost-device backends for improved isolation and flexibility. The RNG frontend is implemented as the initial use case, and the design makes it easy to support additional devices (sound, GPU, CAN, etc.).