Implement Runtime NVMe Instance Storage Discovery Using AWS EBS Symlinks #396
rkoster
left a comment
In general I would have expected this logic to go into the https://github.com/cloudfoundry/bosh-agent/tree/main/infrastructure/devicepathresolver package.
platform/linux_platform.go
Outdated
    p.logger.Debug(logTag, "Found NVMe devices: %v", allNvmeDevices)

    // Identify EBS volumes via symlinks
    ebsSymlinks, err := p.fs.Glob("/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*")
This should somehow be passed in via the agent config in the stemcell builder, because it is IaaS specific.
Created a new component instanceStorageResolver for the linux platform. It has 3 variants:
* autoDetectingInstanceStorageResolver - the linux platform uses this one; it detects which storage resolver to use
* awsNVMeInstanceStorageResolver - detects which devices are instance storage; contains the new logic
* identityInstanceStorageResolver - simple resolver for the traditional naming used before this change
Thank you for the review! That was a big oversight on my end; I'll look into it.
No worries 🙂
We discussed this during the FI WG meeting: this has to rely on the stemcell agent settings and the agent strategy for disk handling.
* Refactor instance storage discovery into configurable component
  Implement auto-detection for instance storage disk type
* Fix windows tests
* Fix windows tests (but for real this time)
As discussed during the working group meeting, focus is now on validating: cloudfoundry/bosh-aws-cpi-release#196 (comment)
Problem
On AWS Nitro-based instances with NVMe devices, the kernel's PCIe enumeration order is non-deterministic. This means:
* /dev/nvme0n1 could be the root EBS volume OR instance storage
* /dev/nvme1n1 could be instance storage OR the root EBS volume
Solution
Implemented runtime discovery to reliably identify instance storage by excluding EBS volumes.
Discovery Algorithm
Why EBS Symlinks Are Reliable
AWS automatically creates persistent symlinks for all EBS volumes via udev rules:
/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol{volume_id}
Backwards Compatibility
Non-NVMe instances: No changes to behavior
/dev/xvdb,/dev/sdb) use CPI paths directly
This must be merged together with the CPI changes: cloudfoundry/bosh-aws-cpi-release#196
Pair @Ivaylogi98