Minor nits in coco docs #394
@@ -162,7 +162,7 @@ Kubernetes Cluster

* ``RuntimeClassInImageCriApi``: Alpha since Kubernetes v1.29 and not enabled by default.
  This feature gate is required to support pod deployments that use multiple snapshotters side-by-side.

- Add both feature gates to your Kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):
+ Add both feature gates to your Kubelet configuration (typically ``sudo vi /var/lib/kubelet/config.yaml``):
Contributor
Not so sure. Can we just say 'typically adjust the file ...'.
.. code-block:: yaml
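   # Body elided in this diff view. A minimal sketch (an assumption, not the
   # file's actual content) of a kubelet feature-gate stanza; only the gate
   # quoted above is shown because the second gate's name is not visible here.
   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   featureGates:
     RuntimeClassInImageCriApi: true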
@@ -180,6 +180,35 @@ Kubernetes Cluster

   $ sudo systemctl restart kubelet

.. _configure-image-pull-timeouts:
* Configure image pull timeouts. The guest-pull mechanism pulls images inside the confidential VM, which means large images can take longer to download and delay container start.

Technically speaking, pulling images inside the guest doesn't necessarily take longer than pulling on the host. The issue is more that guest pull invalidates any kind of caching that would normally happen. Basically you have to pull the image every time you run a pod.
Kubelet can de-allocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.

If you plan to use large images, increase ``runtimeRequestTimeout`` in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ to ``20m`` to match the default values for the NVIDIA shim configurations in Kata Containers.
Add or update the ``runtimeRequestTimeout`` field in your kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):

.. code-block:: yaml
   :emphasize-lines: 3

   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   runtimeRequestTimeout: 20m
Restart the kubelet service to apply the change:

.. code-block:: console

   $ sudo systemctl restart kubelet
Optionally, you can configure additional timeouts for the NVIDIA Shim and Kata Agent Policy.

Contributor
Instead of NVIDIA shim -> Kata Shim. Please do not use Kata Agent Policy here. This timeout is unrelated to the kata agent policy feature. We can just say 'kata agent' here.
The NVIDIA shim configurations in Kata Containers use a default ``create_container_timeout`` of 1200 seconds (20 minutes).
This controls how long the shim allows a container to remain in the container-creating state.
If you need a timeout of more than 1200 seconds, you will also need to adjust the Kata Agent Policy's ``image_pull_timeout`` value, which controls the agent-side timeout for the guest image pull.
To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value in the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` pod annotation.
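As a rough sketch, the annotation route might look like the following; the timeout value of 1800 and its units (seconds) are illustrative assumptions, not values taken from these docs:

.. code-block:: yaml

   metadata:
     annotations:
       # 1800 is a hypothetical value chosen for illustration
       io.katacontainers.config.hypervisor.kernel_params: "agent.image_pull_timeout=1800"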
.. _installation-and-configuration:

Installation
@@ -461,7 +490,7 @@ For further configuration settings, refer to the following sections:

Run a Sample Workload
=====================
- A pod manifest for a confidential container GPU workload requires that you specify the ``kata-qemu-nvidia-gpu-snp`` runtime class for SEV-SNP or ``kata-qemu-nvidia-gpu-tdx`` for TDX.
+ A pod manifest for a confidential container GPU workload requires that you specify the ``kata-qemu-nvidia-gpu-snp`` runtime class for AMD based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel based systems.

nit: hyphen in
1. Create a file, such as the following ``cuda-vectoradd-kata.yaml`` sample, specifying the kata-qemu-nvidia-gpu-snp runtime class:
@@ -474,35 +503,37 @@ A pod manifest for a confidential container GPU workload requires that you speci

     name: cuda-vectoradd-kata
     namespace: default
   spec:
-    runtimeClassName: kata-qemu-nvidia-gpu-snp
+    runtimeClassName: kata-qemu-nvidia-gpu-snp # or kata-qemu-nvidia-gpu-tdx
     restartPolicy: Never
     containers:
     - name: cuda-vectoradd
       image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
       resources:
         limits:
-          nvidia.com/pgpu: "1"
+          nvidia.com/pgpu: "1" # for single GPU passthrough
           memory: 16Gi
The following are Confidential Containers configurations in the sample manifest:
- * Set the runtime class to ``kata-qemu-nvidia-gpu-snp`` for SEV-SNP or ``kata-qemu-nvidia-gpu-tdx`` for TDX, depending on the node type where the workloads should run.
+ * Set the runtime class to ``kata-qemu-nvidia-gpu-snp`` for AMD based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel based systems, depending on the node type where the workloads should run.
* In the sample above, ``nvidia.com/pgpu`` is the default resource type for GPUs.
  If you are deploying on a heterogeneous cluster, you might want to update the default behavior by specifying the ``P_GPU_ALIAS`` environment variable for the Kata device plugin.
  Refer to the :ref:`Configuring GPU or NVSwitch Resource Types Name <coco-configuration-heterogeneous-clusters>` section on this page for more details.
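  As a rough sketch, setting the variable on the device plugin container might look like the following; the alias value and the deployment mechanism are hypothetical, for illustration only:

  .. code-block:: yaml

     env:
     - name: P_GPU_ALIAS
       value: "pgpu-h100"  # hypothetical alias value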
- * If you have machines that support multi-GPU passthrough, use a pod deployment manifest that specifies 8 PGPU and 4 NVSwitch resources.
+ * If you have machines that support multi-GPU passthrough, use a pod deployment manifest that specifies 8 PGPU.
+   If you are using NVIDIA Hopper GPUs with PPCIE mode, also specify 4 NVSwitch resources.
.. code-block:: yaml

   resources:
     limits:
       nvidia.com/pgpu: "8"
-      nvidia.com/nvswitch: "4"
+      nvidia.com/nvswitch: "4" # Only for NVIDIA Hopper GPUs with PPCIE mode
.. note::
-   If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, also refer to :ref:`Managing the Confidential Computing Mode <managing-confidential-computing-mode>` for details on how to set the ``ppcie`` mode.
+   If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, you must also set the Confidential Computing mode to ``ppcie``.
+   Refer to :ref:`Managing the Confidential Computing Mode <managing-confidential-computing-mode>` for details.
2. Create the pod:
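The create command is unchanged context that this diff elides; given the file from step 1 and the delete command in the next hunk, it is presumably:

.. code-block:: console

   $ kubectl apply -f cuda-vectoradd-kata.yaml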
@@ -555,6 +586,7 @@ A pod manifest for a confidential container GPU workload requires that you speci

   $ kubectl delete -f cuda-vectoradd-kata.yaml
.. _coco-configuration-settings:

Common GPU Operator Configuration Settings
@@ -664,7 +696,6 @@ When you change the mode, the manager performs the following actions:

However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode.
* Unbinds the GPU from the VFIO PCI device driver.
* Changes the mode and resets the GPU.
* Reschedules the other GPU Operator operands.
@@ -807,44 +838,10 @@ Refer to the :ref:`Managing the Confidential Computing Mode <managing-confidenti

The NVIDIA Blackwell architecture uses NVLink encryption which places the switches outside of the Trusted Computing Base (TCB) and only requires the GPU Confidential Computing mode to be set to ``on``.
.. _configure-image-pull-timeouts:

Configure Image Pull Timeouts
=============================
The guest-pull mechanism pulls images inside the confidential VM, which means large images can take longer to download and delay container start.
Kubelet can de-allocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.

If you plan to use large images, increase ``runtimeRequestTimeout`` in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ to ``20m`` to match the default values for the NVIDIA shim configurations in Kata Containers.

Add or update the ``runtimeRequestTimeout`` field in your kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):
.. code-block:: yaml
   :emphasize-lines: 3

   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   runtimeRequestTimeout: 20m
Restart the kubelet service to apply the change:

.. code-block:: console

   $ sudo systemctl restart kubelet
Additional timeouts to consider updating are the NVIDIA Shim and Kata Agent Policy timeouts.
The NVIDIA shim configurations in Kata Containers use a default ``create_container_timeout`` of 1200 seconds (20 minutes).
This controls the time the shim allows for a container to remain in container creating state.

If you need a timeout of more than 1200 seconds, you will also need to adjust Kata Agent Policy's ``image_pull_timeout`` value which controls the agent-side timeout for guest-image pull.
To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value in a pod annotation in the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` annotation.
Next Steps
==========
* Refer to the :doc:`Attestation <attestation>` page for more information on configuring attestation.
* To help manage the lifecycle of Kata Containers, install the `Kata Lifecycle Manager <https://github.com/kata-containers/lifecycle-manager>`_.
  This Argo Workflows-based tool manages Kata Containers upgrades and day-two operations.
* Refer to the `NVIDIA Confidential Computing documentation <https://docs.nvidia.com/confidential-computing>`_ for additional information.
* Licensing information is available on the :doc:`Licensing <licensing>` page.
@fitzthum do we actually want to propose this? another method to provide this information is via init-data. should we not rather also abstract from this?
I am fine with either. This way is simpler; maybe better in the short term. Init-data is an important concept, though. We should cover it at some point.