virtio-mem is a para-virtualized memory device that enables dynamic memory
resizing for virtual machines. Unlike traditional memory hotplug mechanisms,
virtio-mem provides a flexible and efficient solution that works across
different architectures.
The virtio-mem device manages a contiguous memory region that is divided into
fixed-size blocks. The host can request the guest to plug (make available) or
unplug (release) memory by changing the device's target size, and the guest
driver responds by allocating or freeing memory blocks accordingly. This
approach provides fine-grained control over guest memory with minimal overhead.
Firecracker further adds the concept of slots: sets of contiguous blocks (usually 128 MiB per slot) that can be fully protected from guest access. This prevents a malicious guest from accessing the hotpluggable memory range when the host has not allowed it.
To support memory hotplugging via virtio-mem, you must use a guest kernel with
the appropriate version and configuration options enabled as follows:
- x86_64: the minimum kernel version is 5.16
  - Earlier versions of the kernel don't support
    VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
- aarch64: the minimum kernel version is 5.18
For more information about officially supported guest kernels, refer to the kernel policy documentation.
CONFIG_VIRTIO_MEM needs to be enabled in the guest kernel in order to use
virtio-mem.
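The version requirements above can be checked with a small helper before booting a guest image. This is a minimal sketch: `kernel_supports_virtio_mem` is a hypothetical function name, not a Firecracker tool, and it only compares the major and minor version numbers against the minimums quoted above.

```shell
# Hypothetical helper: succeeds when the kernel version meets the minimum
# quoted above for the architecture (x86_64 -> 5.16, aarch64 -> 5.18).
kernel_supports_virtio_mem() (
    arch=$1
    version=$2                      # e.g. "5.16.0" or the output of `uname -r`
    case $arch in
        x86_64)  min_minor=16 ;;
        aarch64) min_minor=18 ;;
        *) echo "unknown architecture: $arch" >&2; exit 2 ;;
    esac
    major=${version%%.*}            # text before the first dot
    rest=${version#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}          # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge "$min_minor" ]; }
)
```

Inside a guest, it could be called as `kernel_supports_virtio_mem "$(uname -m)" "$(uname -r)"`. It does not check whether CONFIG_VIRTIO_MEM is enabled; that has to be verified against the kernel configuration separately.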
The virtio-mem device must be configured during VM setup with the total amount
of memory that can be hotplugged, before starting the virtual machine. This can
be done through a PUT request on /hotplug/memory or by including the
configuration in the JSON configuration file. In both cases, when the VM is
started, the hotpluggable region will be completely unplugged.
[!Note] Memory configured through /hotplug/memory is a separate pool of memory from the usual "boot memory". Only memory configured through the hotplug endpoint can be plugged or unplugged dynamically.
- total_size_mib (required): The maximum size of hotpluggable memory in MiB. This defines the upper bound of memory that can be added to the VM. Must be a multiple of slot_size_mib.
- block_size_mib (optional, default: 2): The size of individual memory blocks in MiB. Must be at least 2 MiB and a power of 2. Larger block sizes provide better performance but less granularity (harder for the guest to unplug).
- slot_size_mib (optional, default: 128): The size of KVM memory slots in MiB. Must be at least block_size_mib and a power of 2. Larger slot sizes improve performance for large memory operations but reduce unplugging protection efficiency.
It is recommended to leave these values at their defaults unless strict memory
protection is required, in which case block_size_mib should be equal to
slot_size_mib. Note that this will make it harder for the guest kernel to find
contiguous memory to hot-unplug. Refer to the
Memory Protection section below for more details.
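The constraints on the three parameters can be checked on the host before sending the configuration. The following is a minimal sketch that encodes only the rules stated above; `validate_hotplug_config` and `is_power_of_two` are hypothetical helper names, and Firecracker may apply further checks that are not modelled here.

```shell
# Succeeds when $1 is a positive power of two.
is_power_of_two() (
    n=$1
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
)

# Usage: validate_hotplug_config TOTAL_MIB [BLOCK_MIB] [SLOT_MIB]
# The defaults (2 and 128) mirror the defaults documented above.
validate_hotplug_config() (
    total=$1
    block=${2:-2}
    slot=${3:-128}
    [ "$block" -ge 2 ] && is_power_of_two "$block" \
        || { echo "block_size_mib must be a power of 2 and at least 2" >&2; exit 1; }
    [ "$slot" -ge "$block" ] && is_power_of_two "$slot" \
        || { echo "slot_size_mib must be a power of 2 and at least block_size_mib" >&2; exit 1; }
    [ $(( total % slot )) -eq 0 ] \
        || { echo "total_size_mib must be a multiple of slot_size_mib" >&2; exit 1; }
)
```

For example, `validate_hotplug_config 1024 128 128` accepts the strict-protection setup in which the block size equals the slot size, while `validate_hotplug_config 1000` is rejected because 1000 is not a multiple of 128.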
Here is an example of how to configure the virtio-mem device via the API. In
this example, the hotpluggable memory is configured with a maximum of 1 GiB in
size and default block and slot sizes.
socket_location=/run/firecracker.socket
curl --unix-socket $socket_location -i \
-X PUT 'http://localhost/hotplug/memory' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "{
\"total_size_mib\": 1024,
\"block_size_mib\": 2,
\"slot_size_mib\": 128
}"

[!Note] This is only allowed before the InstanceStart action and not on snapshot-restored VMs (which will use the configuration saved in the snapshot).
To configure via JSON, add the following to your VM configuration file. In this example, the hotpluggable memory is configured with a maximum of 1 GiB in size and default block and slot sizes.
{
"memory-hotplug": {
"total_size_mib": 1024,
"block_size_mib": 2,
"slot_size_mib": 128
}
}

After configuration, you can query the device status at any time:
socket_location=/run/firecracker.socket
curl --unix-socket $socket_location -i \
-X GET 'http://localhost/hotplug/memory' \
-H 'Accept: application/json'

This returns information about the current device state, including:

- total_size_mib: Maximum hotpluggable memory size
- block_size_mib: Block size used by the device
- slot_size_mib: Slot size used by Firecracker (granularity of memory protection)
- plugged_size_mib: Memory currently plugged by (available to) the guest
- requested_size_mib: Target memory size set by the host
Once configured and the VM is running, you can dynamically adjust the amount of memory available to the guest by updating the requested size, which is the target that the guest should reach by requesting to plug or unplug memory blocks. The initial value of the requested size is 0 MiB, meaning that no hotpluggable memory blocks are plugged on VM boot.
To add memory to a running VM, request a greater size from the virtio-mem
device:
socket_location=/run/firecracker.socket
curl --unix-socket $socket_location -i \
-X PATCH 'http://localhost/hotplug/memory' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "{
\"requested_size_mib\": 512
}"

Setting a higher requested_size_mib value causes the guest driver to allocate
memory blocks to reach the requested size. The process is asynchronous: the
guest incrementally plugs memory until it reaches the target. It is
recommended to use the GET API to monitor the progress of the hotplugging
by the driver. The operation is complete when plugged_size_mib is equal to
requested_size_mib.
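The monitoring step can be scripted as a polling loop. This is a sketch under stated assumptions: it assumes the GET response body is a flat JSON object carrying the plugged_size_mib and requested_size_mib fields listed for the device status, and the helper names (`json_int_field`, `hotplug_converged`, `wait_for_hotplug`) are hypothetical. It parses the body with sed so that no extra tools are needed.

```shell
socket_location=/run/firecracker.socket

# Print the integer value of field $1 from a JSON object read on stdin.
json_int_field() {
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Succeeds when the status JSON in $1 reports plugged == requested.
hotplug_converged() (
    plugged=$(printf '%s\n' "$1" | json_int_field plugged_size_mib)
    requested=$(printf '%s\n' "$1" | json_int_field requested_size_mib)
    [ -n "$plugged" ] && [ "$plugged" = "$requested" ]
)

# Poll the device until it converges or the attempts run out.
# Usage: wait_for_hotplug [ATTEMPTS] [DELAY_SECONDS]
wait_for_hotplug() (
    attempts=${1:-30}
    delay=${2:-1}
    while [ "$attempts" -gt 0 ]; do
        status=$(curl --unix-socket "$socket_location" -s \
            -X GET 'http://localhost/hotplug/memory' \
            -H 'Accept: application/json')
        hotplug_converged "$status" && exit 0
        attempts=$(( attempts - 1 ))
        sleep "$delay"
    done
    exit 1
)
```

After a PATCH request, `wait_for_hotplug 60 1 || echo "guest did not reach the target"` gives the guest up to a minute to converge. The timeout is arbitrary and should be tuned to the workload.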
To remove memory from a running VM, request a lower size:
socket_location=/run/firecracker.socket
curl --unix-socket $socket_location -i \
-X PATCH 'http://localhost/hotplug/memory' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "{
\"requested_size_mib\": 256
}"

Setting a lower requested_size_mib value causes the guest driver to free
memory blocks. Once the guest reports a block to be unplugged, the unplugged
memory is immediately freed from the host process. If all blocks in a memory
slot are unplugged, then Firecracker will also protect the memory slot, removing
access from the guest.
To remove all hotplugged memory, set requested_size_mib to 0:
curl --unix-socket $socket_location -i \
-X PATCH 'http://localhost/hotplug/memory' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"requested_size_mib": 0}'

[!Note] Unplugging requires the guest to cooperate and actually be able to find and report memory blocks that can be moved or freed by the host. As in the hotplugging case, it is recommended to monitor the operation through the GET API.
The guest kernel must be configured with specific boot or runtime module
parameters to ensure optimal behavior of the virtio-mem driver and memory
hotplug module.
In short:
- Pass memhp_default_state=online_movable if hot-removal is required and there is enough free boot memory for allocating the memory map of the hotplugged memory (64B per 4KiB page).
- Pass memory_hotplug.memmap_on_memory=1 memhp_default_state=online if hot-removal is not required and the hotpluggable memory area can be much bigger than the normal memory.
The memhp_default_state parameter controls how newly hotplugged memory is
onlined by the kernel. It is required for automatically onlining new memory
pages. Setting it to online_movable, as below, is recommended for reliable
memory hot-removal.
memhp_default_state=online_movable
The online_movable setting ensures that:
- Hotplugged memory is placed in the MOVABLE zone
- The kernel can migrate pages when unplugging is requested
- Memory can be successfully freed back to the host
Other possible values (not recommended for hot-removal):
- online: Places memory automatically between NORMAL and MOVABLE zone (may prevent hot-remove)
- online_kernel: Places memory in NORMAL zone (may prevent hot-remove)
- offline (default): Memory requires manual onlining
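To confirm which onlining policy a guest was booted with, the kernel command line can be inspected. The helper below is a minimal sketch with a hypothetical name; it assumes the kernel's built-in default is offline, as listed above, and reports that value when the parameter is absent.

```shell
# Print the memhp_default_state value found in the command line string $1,
# or "offline" when the parameter is not present.
memhp_state_from_cmdline() (
    set -f                          # no glob expansion while splitting words
    state=offline
    for arg in $1; do
        case $arg in
            memhp_default_state=*) state=${arg#memhp_default_state=} ;;
        esac
    done
    echo "$state"
)
```

Inside the guest it can be called as `memhp_state_from_cmdline "$(cat /proc/cmdline)"`. If the parameter appears more than once, this sketch reports the last occurrence.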
The memory_hotplug.memmap_on_memory parameter controls whether the kernel
allocates the memory map (struct pages) for hotplugged memory from the
hotplugged memory itself, rather than from boot memory. Without this parameter,
the kernel needs 64B of boot memory for every 4KiB page of hotplugged memory.
For example, it would need 256 MiB of free "boot" memory to hotplug 16 GiB of
memory. This parameter only works if the memory is not entirely hotplugged as
MOVABLE.
memory_hotplug.memmap_on_memory=1 memhp_default_state=online
This configuration is recommended in case hot-removal is not a priority, and the hotpluggable memory area is very large.
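The boot-memory cost of the memory map follows directly from the 64B per 4KiB page figure. As a rough sizing aid, the sketch below computes it for a given hotpluggable size; `memmap_overhead_mib` is a hypothetical helper name, and the result covers only the struct page array, not any other kernel bookkeeping.

```shell
# Print the boot memory (in MiB) needed for the memory map of $1 MiB of
# hotplugged memory, assuming 64 B of struct page per 4 KiB page.
memmap_overhead_mib() (
    hotplug_mib=$1
    pages=$(( hotplug_mib * 1024 / 4 ))     # number of 4 KiB pages
    bytes=$(( pages * 64 ))                 # 64 B per page
    echo $(( bytes / 1024 / 1024 ))
)
```

For 16 GiB of hotpluggable memory, `memmap_overhead_mib 16384` prints 256, which is 1/64 of the hotplugged size.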
For more detailed and up-to-date information about memory hotplug in the Linux kernel, refer to the official kernel documentation: https://docs.kernel.org/admin-guide/mm/memory-hotplug.html
The virtio-mem device is a paravirtualized device requiring cooperation from
a driver in the guest.
Firecracker provides the following guarantees about unplugged memory:
- Memory that is never plugged is protected: Memory that has never been
  plugged before is protected from the guest by not making it available to the
  guest via a KVM slot and by using mprotect to prevent access from device
  emulation. Any attempt by the guest to access unplugged memory will result in
  a fault and may crash the Firecracker process.
- Unplugged memory slots are protected: Memory slots that have been unplugged
  are removed from KVM and mprotect-ed. This requires the guest to report
  contiguous blocks to be freed for the memory slot to be actually protected.
- Unplugged memory blocks are freed: When a memory block is unplugged, the
  backing pages are freed, for example using madvise(MADV_DONTNEED) for anon
  memory, returning memory to the host at block granularity.
While Firecracker enforces memory isolation at the host level, a compromised guest driver could:
- Fail to plug or unplug memory as requested by the device
- Attempt to access unplugged memory (will result in a fault and crash of Firecracker)
Users should:
- Be prepared to handle cases where the guest doesn't cooperate with memory
  operations by monitoring the GET API.
- Implement host-level memory limits and monitoring, e.g. through cgroups.
virtio-mem is compatible with all Firecracker features. Below are some
specific considerations for other features when memory hotplugging is used.
Full and diff snapshots will include the unplugged areas as sparse "holes" in the memory snapshot file. Sparse file support is recommended to efficiently handle the memory snapshot files.
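Whether a memory snapshot file is actually stored sparsely can be checked by comparing its apparent size with the space allocated on disk. The following sketch assumes GNU coreutils `stat`; `sparse_report` is a hypothetical helper name.

```shell
# Print the apparent size and the allocated size (both in bytes) of file $1.
# A sparse file shows an allocated size well below its apparent size.
sparse_report() (
    file=$1
    apparent=$(stat -c '%s' "$file")
    # %b is the number of allocated blocks, %B the size of each such block.
    allocated=$(( $(stat -c '%b' "$file") * $(stat -c '%B' "$file") ))
    echo "apparent=$apparent allocated=$allocated"
)
```

When copying snapshot files with GNU `cp`, the `--sparse=always` option keeps the holes instead of writing them out as zeroes.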
The userfaultfd (uffd) handler will need to handle the entire
hotpluggable memory range even if unplugged. The uffd handler may decide to
unregister unplugged memory ranges (holes in the memory file). The uffd handler
will also need to handle UFFD_EVENT_REMOVE events for hot-removed blocks,
either unregistering the range or storing the information and returning an empty
page on the next access.
vhost-user is fully supported, but Firecracker cannot guarantee
protection of unplugged memory from a vhost-user backend. A malicious guest
driver may be able to trick the backend into accessing unplugged memory. This is
not possible in Firecracker itself, as unplugged memory slots are mprotect-ed.