17 changes: 8 additions & 9 deletions confidential-containers/attestation.rst
@@ -69,8 +69,7 @@ Provision Trustee
Trustee is an open-source framework used in Confidential Containers to verify attestation evidence and conditionally release secrets.
For a full overview of attestation with Trustee, refer to the upstream `Trustee documentation <https://confidentialcontainers.org/docs/attestation/>`_.

- To provision a Trustee instance, follow the upstream `Install Trustee in Docker <https://confidentialcontainers.org/docs/attestation/installation/docker/>`_ guide.
- This is the recommended install method.
+ To provision a Trustee instance, follow the recommended upstream `Install Trustee in Docker <https://confidentialcontainers.org/docs/attestation/installation/docker/>`_ guide.
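
For reference, the upstream Docker-based install generally amounts to cloning the Trustee repository and bringing its services up with Docker Compose. The following is an illustrative sketch only; the authoritative steps, prerequisites, and file layout are in the linked guide:

.. code-block:: console

   # Sketch of the upstream flow; follow the linked guide for exact steps.
   $ git clone https://github.com/confidential-containers/trustee
   $ cd trustee
   $ docker compose up -d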

.. note::

@@ -86,20 +85,21 @@ After you complete installation, Trustee is configured to use the NVIDIA Remote
Configure Workloads for Attestation
====================================

- To enable attestation for your workloads, point them to the Trustee network endpoint, sometimes referred to as the Key Broker Service (KBS) endpoint, by adding the following annotation to your workload pod spec:
+ To enable attestation for your workloads, point them to the Trustee network endpoint, also called the Key Broker Service (KBS) endpoint, by adding the following annotation to your workload pod spec:

.. code-block:: yaml

   io.katacontainers.config.hypervisor.kernel_params: "agent.aa_kbc_params=cc_kbc::http://<kbs-ip>:<kbs-port>"
Contributor comment: @fitzthum do we actually want to propose this? Another method to provide this information is via init-data. Should we not rather also abstract from this?

Reply: I am fine with either. This way is simpler; maybe better in the short term. Init-data is an important concept, though. We should cover it at some point.


- Replace ``<kbs-ip>`` with the IP address or hostname at which your Trustee instance is reachable from the worker nodes, and ``<kbs-port>`` with the port (default: ``8080``).
+ Replace ``<kbs-ip>`` with the IP address or hostname at which your Trustee instance is reachable from the worker nodes.
+ Replace ``<kbs-port>`` with the port that Trustee listens on (default: ``8080``).
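
For illustration, a minimal pod spec carrying this annotation might look like the following sketch. The pod name and the KBS address ``10.0.0.5:8080`` are placeholder assumptions; the annotation key and the runtime class names come from this page, and the image is the CUDA sample used later in this documentation:

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: attested-workload   # hypothetical name
     annotations:
       # Placeholder KBS address; substitute your Trustee endpoint and port.
       io.katacontainers.config.hypervisor.kernel_params: "agent.aa_kbc_params=cc_kbc::http://10.0.0.5:8080"
   spec:
     runtimeClassName: kata-qemu-nvidia-gpu-snp   # or kata-qemu-nvidia-gpu-tdx
     containers:
     - name: app
       image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04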

Refer to the upstream `Setup Confidential Containers <https://confidentialcontainers.org/docs/attestation/coco-setup/>`_ documentation for more information on configuring workloads for attestation.

.. _customize-attestation:

- Customize Attestation Workflows
- ===============================
+ Optional: Customize Attestation Workflows
+ =========================================

After Trustee is provisioned and workloads are configured, you can customize attestation workflows to enforce your desired security policies.
This can include configuring the following:
@@ -122,6 +122,5 @@ Use the Trustee log to diagnose the attestation process.
Next Steps
==========

- * Refer to the :doc:`deployment guide <confidential-containers-deploy>` for Confidential Containers setup instructions.
- * Refer to the upstream `Confidential Containers Features <https://confidentialcontainers.org/docs/features>`_ documentation for a complete list of attestation-dependent features.
- * Refer to the `NVIDIA Confidential Computing documentation <https://docs.nvidia.com/confidential-computing>`_ for additional information.
+ * Refer to the upstream `Confidential Containers Features <https://confidentialcontainers.org/docs/features>`_ for complete documentation on attestation features.
+ * If you haven't already, refer to the :doc:`Confidential Containers deployment guide <confidential-containers-deploy>` to configure your environment for confidential workloads.
83 changes: 40 additions & 43 deletions confidential-containers/confidential-containers-deploy.rst
@@ -162,7 +162,7 @@ Kubernetes Cluster
* ``RuntimeClassInImageCriApi``: Alpha since Kubernetes v1.29 and is not enabled by default.
This feature gate is required to support pod deployments that use multiple snapshotters side-by-side.

- Add both feature gates to your Kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):
+ Add both feature gates to your Kubelet configuration (typically ``sudo vi /var/lib/kubelet/config.yaml``):
Contributor comment: Not so sure. Can we just say 'typically adjust the file ...'. sudo vi is a bit specific and a bit loose. [There may be users which literally wouldn't be able to exit vim anymore]


.. code-block:: yaml

@@ -180,6 +180,35 @@ Kubernetes Cluster

$ sudo systemctl restart kubelet

.. _configure-image-pull-timeouts:

* Configure image pull timeouts. The guest-pull mechanism pulls images inside the confidential VM, which means large images can take longer to download and delay container start.
Comment: Technically speaking, pulling images inside the guest doesn't necessarily take longer than pulling on the host. The issue is more that guest pull invalidates any kind of caching that would normally happen. Basically you have to pull the image every time you run a pod.

Kubelet can de-allocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.

If you plan to use large images, increase ``runtimeRequestTimeout`` in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ to ``20m`` to match the default values for the NVIDIA shim configurations in Kata Containers.

Add or update the ``runtimeRequestTimeout`` field in your kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):

.. code-block:: yaml
   :emphasize-lines: 3

   apiVersion: kubelet.config.k8s.io/v1beta1
   kind: KubeletConfiguration
   runtimeRequestTimeout: 20m

Restart the kubelet service to apply the change:

.. code-block:: console

   $ sudo systemctl restart kubelet

Optionally, you can configure additional timeouts for the NVIDIA Shim and Kata Agent Policy.
Contributor comment: Instead of NVIDIA shim -> Kata Shim.

Please do not use Kata Agent Policy here. This timeout is unrelated to the kata agent policy feature. We can just say 'kata agent' here.

The NVIDIA shim configurations in Kata Containers use a default ``create_container_timeout`` of 1200 seconds (20 minutes).
This controls how long the shim allows a container to remain in the container-creating state.
If you need a timeout of more than 1200 seconds, you must also adjust the Kata Agent Policy's ``image_pull_timeout`` value, which controls the agent-side timeout for the guest image pull.
To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value through the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` pod annotation.
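
As a sketch (assuming the value is expressed in seconds, consistent with the 1200-second default above), raising the agent-side pull timeout to 30 minutes through the pod annotation might look like this:

.. code-block:: yaml

   # Assumed syntax: agent.image_pull_timeout takes a value in seconds.
   # Combine it with other kernel parameters in the same quoted string if needed.
   io.katacontainers.config.hypervisor.kernel_params: "agent.image_pull_timeout=1800"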


.. _installation-and-configuration:

Installation
@@ -461,7 +490,7 @@ For further configuration settings, refer to the following sections:
Run a Sample Workload
=====================

- A pod manifest for a confidential container GPU workload requires that you specify the ``kata-qemu-nvidia-gpu-snp`` runtime class for SEV-SNP or ``kata-qemu-nvidia-gpu-tdx`` for TDX.
+ A pod manifest for a confidential container GPU workload requires that you specify the ``kata-qemu-nvidia-gpu-snp`` runtime class for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems.
Comment: nit: hyphen in AMD-based? (and intel too)


1. Create a file, such as the following ``cuda-vectoradd-kata.yaml`` sample, specifying the kata-qemu-nvidia-gpu-snp runtime class:

@@ -474,35 +503,37 @@ A pod manifest for a confidential container GPU workload requires that you speci
    name: cuda-vectoradd-kata
    namespace: default
  spec:
-   runtimeClassName: kata-qemu-nvidia-gpu-snp
+   runtimeClassName: kata-qemu-nvidia-gpu-snp # or kata-qemu-nvidia-gpu-tdx
    restartPolicy: Never
    containers:
    - name: cuda-vectoradd
      image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
      resources:
        limits:
-         nvidia.com/pgpu: "1"
+         nvidia.com/pgpu: "1" # for single GPU passthrough
          memory: 16Gi

The following are Confidential Containers configurations in the sample manifest:

- * Set the runtime class to ``kata-qemu-nvidia-gpu-snp`` for SEV-SNP or ``kata-qemu-nvidia-gpu-tdx`` for TDX, depending on the node type where the workloads should run.
+ * Set the runtime class to ``kata-qemu-nvidia-gpu-snp`` for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems, depending on the node type where the workloads should run.

* In the sample above, ``nvidia.com/pgpu`` is the default resource type for GPUs.
  If you are deploying on a heterogeneous cluster, you might want to update the default behavior by specifying the ``P_GPU_ALIAS`` environment variable for the Kata device plugin, as sketched after this list.
  Refer to the :ref:`Configuring GPU or NVSwitch Resource Types Name <coco-configuration-heterogeneous-clusters>` section on this page for more details.

- * If you have machines that support multi-GPU passthrough, use a pod deployment manifest that specifies 8 PGPU and 4 NVSwitch resources.
+ * If you have machines that support multi-GPU passthrough, use a pod deployment manifest that specifies 8 PGPU resources.
+   If you are using NVIDIA Hopper GPUs with PPCIE mode, also specify 4 NVSwitch resources.

.. code-block:: yaml

   resources:
     limits:
       nvidia.com/pgpu: "8"
-      nvidia.com/nvswitch: "4"
+      nvidia.com/nvswitch: "4" # Only for NVIDIA Hopper GPUs with PPCIE mode

.. note::
-    If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, also refer to :ref:`Managing the Confidential Computing Mode <managing-confidential-computing-mode>` for details on how to set the ``ppcie`` mode.
+    If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, you must also set the Confidential Computing mode to ``ppcie``.
+    Refer to :ref:`Managing the Confidential Computing Mode <managing-confidential-computing-mode>` for details.
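
As referenced in the list above, the following is a hypothetical sketch of overriding the advertised GPU resource name through the Kata device plugin's environment. The ``P_GPU_ALIAS`` variable name comes from this page; the container-spec context and the alias value are illustrative assumptions:

.. code-block:: yaml

   # Hypothetical excerpt from the Kata device plugin container spec.
   # The alias value is illustrative; refer to the section linked above.
   env:
   - name: P_GPU_ALIAS
     value: "pgpu-h100"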


2. Create the pod:
@@ -555,6 +586,7 @@ A pod manifest for a confidential container GPU workload requires that you speci
$ kubectl delete -f cuda-vectoradd-kata.yaml



.. _coco-configuration-settings:

Common GPU Operator Configuration Settings
@@ -664,7 +696,6 @@ When you change the mode, the manager performs the following actions:

However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode (see the check sketched after this list).

- * Unbinds the GPU from the VFIO PCI device driver.
* Changes the mode and resets the GPU.
* Reschedules the other GPU Operator operands.
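
As an illustrative check (``<node-name>`` is a placeholder), you can list the pods scheduled on a node before changing the mode:

.. code-block:: console

   # List all pods currently scheduled on the node; verify none are user workloads.
   $ kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>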

@@ -807,44 +838,10 @@ Refer to the :ref:`Managing the Confidential Computing Mode <managing-confidenti
The NVIDIA Blackwell architecture uses NVLink encryption which places the switches outside of the Trusted Computing Base (TCB) and only requires the GPU Confidential Computing mode to be set to ``on``.


- .. _configure-image-pull-timeouts:
-
- Configure Image Pull Timeouts
- =============================
-
- The guest-pull mechanism pulls images inside the confidential VM, which means large images can take longer to download and delay container start.
- Kubelet can de-allocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.
-
- If you plan to use large images, increase ``runtimeRequestTimeout`` in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ to ``20m`` to match the default values for the NVIDIA shim configurations in Kata Containers.
-
- Add or update the ``runtimeRequestTimeout`` field in your kubelet configuration (typically ``/var/lib/kubelet/config.yaml``):
-
- .. code-block:: yaml
-    :emphasize-lines: 3
-
-    apiVersion: kubelet.config.k8s.io/v1beta1
-    kind: KubeletConfiguration
-    runtimeRequestTimeout: 20m
-
- Restart the kubelet service to apply the change:
-
- .. code-block:: console
-
-    $ sudo systemctl restart kubelet
-
- Additional timeouts to consider updating are the NVIDIA Shim and Kata Agent Policy timeouts.
- The NVIDIA shim configurations in Kata Containers use a default ``create_container_timeout`` of 1200 seconds (20 minutes).
- This controls the time the shim allows for a container to remain in container creating state.
-
- If you need a timeout of more than 1200 seconds, you will also need to adjust Kata Agent Policy's ``image_pull_timeout`` value which controls the agent-side timeout for guest-image pull.
- To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value in a pod annotation in the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` annotation.


Next Steps
==========

* Refer to the :doc:`Attestation <attestation>` page for more information on configuring attestation.
* To help manage the lifecycle of Kata Containers, install the `Kata Lifecycle Manager <https://github.com/kata-containers/lifecycle-manager>`_.
This Argo Workflows-based tool manages Kata Containers upgrades and day-two operations.
* Refer to the `NVIDIA Confidential Computing documentation <https://docs.nvidia.com/confidential-computing>`_ for additional information.
* Licensing information is available on the :doc:`Licensing <licensing>` page.