Add GPU support to kubernetes_scale on EKS Karpenter #6313
Conversation
| name: {{ Name }}
| spec:
| {%- if NvidiaGpuRequest and Cloud == 'aws' %}
| nodeSelector:
Zach recently changed how the nodeSelectors are applied. Take a look at #6291, and in particular, the (new) GetNodeSelectors() methods in the providers (e.g. within providers/aws/elastic_kubernetes_service.py).
That said, your nodeSelector here refers back to your aws-gpu-nodepool.yaml.j2 file, so I think the new GetNodeSelectors() method probably doesn't apply - keeping this in the yaml.j2 file seems like the right approach here.
No action required (though take a quick look at #6291).
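For readers without #6291 open, here is a minimal sketch of the provider-side pattern being referenced. The class name, attribute, and label key/value below are illustrative assumptions, not the repository's actual code:

```python
# Illustrative sketch only; the real GetNodeSelectors() implementations live
# in the provider classes (e.g. providers/aws/elastic_kubernetes_service.py)
# and may differ from this stand-in.


class EksClusterStandIn:
  """Simplified stand-in for an EKS cluster resource class."""

  def __init__(self, has_gpu_nodepool: bool = False):
    self.has_gpu_nodepool = has_gpu_nodepool  # assumed attribute

  def GetNodeSelectors(self) -> dict[str, str]:
    """Returns nodeSelector labels that rendered pod specs should carry."""
    if self.has_gpu_nodepool:
      # Assumed label; the real key/value depends on the nodepool definition.
      return {'karpenter.sh/nodepool': 'gpu-nodepool'}
    return {}


# Usage: a benchmark merges these labels into each pod spec instead of
# hard-coding a nodeSelector block in its yaml.j2 template.
selectors = EksClusterStandIn(has_gpu_nodepool=True).GetNodeSelectors()
```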
Thanks for the pointer, I checked #6291.
This does not apply in our case, so you are right, no changes are needed here. Thanks.
Hmm, I disagree here. A few reasons:
- GetNodeSelectors & ModifyPodSpecPlacementYaml + the associated ConvertManifestToYamlDicts -> ModifyPodSpecPlacementYaml -> ApplyYaml flow is basically doing the same thing, & the code you've written can be written in this manner. It's very inconsistent & makes for an odd exception for specifically Karpenter to use the old ApplyManifest path instead. To some extent this is just "my new way vs your old way", & I suspect the old way may make more sense to you / you may have already gotten some of this code working via the old way before syncing & realizing the new way existed... but now that the new way does exist we shouldn't mix & match between the two.
- Now, how would this be implemented here & why could an exception make sense? Rich notes that the nodeSelector here refers to the aws-gpu-nodepool.yaml.j2 nodepool, which is only applied in this benchmark, hence putting this code in the resource would be tricky.
- However, this code doesn't actually look benchmark-specific; it should instead be resource-specific. See my comments about aws-gpu-nodepool.yaml.j2 looking the same as in kubernetes ai inference.
So all of this should instead just go into the EksKarpenterCluster class & its GetNodeSelectors + ModifyPodSpecPlacementYaml code.
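To make the suggested flow concrete, here is a rough sketch. The three method names come from the comment above; their signatures, whether they are cluster methods or module-level functions, and the argument passed to ApplyYaml are assumptions rather than verified details of the codebase:

```python
# Sketch only: names from the comment above, signatures assumed.
import yaml


def RenderAndApply(cluster, manifest_path, **manifest_kwargs):
  """Applies a manifest via the yaml-dict flow instead of raw ApplyManifest."""
  # 1. Render the Jinja2 manifest into a list of YAML documents (dicts).
  yaml_docs = cluster.ConvertManifestToYamlDicts(manifest_path,
                                                 **manifest_kwargs)
  # 2. Let the cluster class (e.g. EksKarpenterCluster) inject placement
  #    details into each pod spec: nodeSelectors from GetNodeSelectors(),
  #    tolerations, etc., so the benchmark template stays cloud-agnostic.
  for doc in yaml_docs:
    cluster.ModifyPodSpecPlacementYaml(doc)
  # 3. Apply the modified documents (string vs. dict input is assumed here).
  cluster.ApplyYaml(yaml.dump_all(yaml_docs))
```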
Thanks for the feedback! I've refactored the code to use GetNodeSelectors() and ModifyPodSpecPlacementYaml() consistently, removing the hardcoded nodeSelector from the Jinja2 template. The node selector logic is now centralized in EksKarpenterCluster.GetNodeSelectors() as suggested, aligning with the pattern from PR #6291.
| operator: "Exists"
| effect: "NoExecute"
| tolerationSeconds: {{ PodTimeout }}
| {%- if NvidiaGpuRequest and Cloud == 'aws' %}
Similarly, that same PR introduced the ModifyPodSpecPlacementYaml method, which seems like it could apply here. But again, this block refers back to your nodepool yaml.j2 file, so I think you should probably leave it as is.
No action required.
Thanks for the pointer and the context.
As per my other reply, putting this in ModifyPodSpecPlacementYaml / in code would be a good solution. Right now I think there's also an error - currently this will apply the new nvidia.com/gpu key to all EKS clusters, including both EKS Auto & EKS Karpenter. Not sure if that will work or cause breakages for EKS Auto / I don't think this is intended.
Two possible solutions:
- Refactor to make this modification entirely in python code + yaml_docs rather than yaml. This might also work well with a possible follow-up refactor to avoid duplicating the file.
- Following what you're doing with if EksKarpenter: manifest_kwargs['GpuTaintKey'] = ..., only check {%- if GpuTaintKey %} here, start off initializing it to None, and only give it a value for EKS Karpenter (a sketch of this option follows below).
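A sketch of that second option, under stated assumptions: the kwarg names and is_eks_karpenter_aws_gpu come from this PR's diff, while the helper function, its signature, and the NoSchedule taint effect are illustrative guesses rather than the merged code:

```python
def BuildManifestKwargs(max_wait_time, resource_timeout, cloud,
                        is_eks_karpenter_aws_gpu):
  """Hypothetical helper: GpuTaintKey defaults to None for every cloud."""
  manifest_kwargs = {
      'RolloutTimeout': max_wait_time,
      'PodTimeout': resource_timeout,
      'Cloud': cloud.lower(),
      'GpuTaintKey': None,
  }
  # Only EKS Karpenter gets a taint key, so EKS Auto renders no toleration.
  if is_eks_karpenter_aws_gpu:
    manifest_kwargs['GpuTaintKey'] = 'nvidia.com/gpu'
  return manifest_kwargs


# Template side, shown as a string for reference: an extra toleration entry
# under the pod spec's existing tolerations list, emitted only when
# GpuTaintKey is set. The NoSchedule effect is an assumption about how the
# GPU nodepool taints its nodes.
GPU_TOLERATION_SNIPPET = """
{%- if GpuTaintKey %}
  - key: "{{ GpuTaintKey }}"
    operator: "Exists"
    effect: "NoSchedule"
{%- endif %}
"""
```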
Thanks for catching this. I've fixed it by changing the Jinja2 condition to {%- if GpuTaintKey %} and initializing GpuTaintKey=None in manifest_kwargs, only setting it to 'nvidia.com/gpu' for EKS Karpenter. Now the toleration is only added for Karpenter clusters, not EKS Auto.
| )
|
| if is_eks_karpenter_aws_gpu:
|   cluster.ApplyManifest(
NB: Because ScaleUpPods is called twice, you'll end up applying this nodepool manifest twice. Which is fine - the second time k8s will notice it's already present and do nothing.
But perhaps we should put this into the Prepare() function instead? (And if that works, maybe we should put the first invocation of ScaleUpPods there too... though if you want to do that, then a separate PR might be better.)
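For concreteness, a rough sketch of what moving the one-time nodepool creation into Prepare() could look like; the helper predicate, manifest path, and template parameter are assumptions, not the PR's final code:

```python
def Prepare(benchmark_spec):
  """Benchmark Prepare phase (sketch)."""
  cluster = benchmark_spec.container_cluster
  # Create the Karpenter GPU nodepool once, up front, rather than on every
  # ScaleUpPods call. Re-applying was harmless (kubectl apply is idempotent),
  # but doing it here keeps ScaleUpPods focused on scaling.
  if _IsEksKarpenterAwsGpu(cluster):  # hypothetical helper, defined below
    cluster.ApplyManifest(
        'container/kubernetes_scale/aws-gpu-nodepool.yaml.j2',  # assumed path
        Name='kubernetes-scale-gpu-nodepool',  # assumed template parameter
    )


def _IsEksKarpenterAwsGpu(cluster) -> bool:
  """Hypothetical predicate standing in for the PR's is_eks_karpenter_aws_gpu."""
  # The real check presumably also looks at the GPU request; this placeholder
  # just keeps the sketch self-contained.
  return type(cluster).__name__ == 'EksKarpenterCluster'
```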
@rsgowman please take a look - I have added 2 functions and moved the nodepool creation to Prepare().
With caveats of follow-ups, this PR lgtm.
| RolloutTimeout=max_wait_time,
| PodTimeout=resource_timeout,
| Cloud=FLAGS.cloud.lower(),
| GpuTaintKey=None,  # Only set to 'nvidia.com/gpu' for EKS Karpenter
nit while you're merging conflicts if you get to it: comment is not super needed given fairly obvious naming + logic with if is_eks_karpenter_aws_gpu: kwargs['GpuTaintKey'] = 'nvidia.com/gpu'. Comments are needed to explain complicated code, not simple repetition.
Thanks, removed the comment.
hubatish left a comment:
waiting on conflict resolution
| **manifest_kwargs,
| )
|
| # Always use ModifyPodSpecPlacementYaml to add nodeSelectors via GetNodeSelectors()
This is generating an internal lint error (line too long; >80 chars)
Updated; the line is now 79 characters.
Merged commit 0c3ebb0 into GoogleCloudPlatform:master.
Summary
Add AWS (EKS) support to kubernetes_scale.
Description
Enable running the benchmark on AWS EKS, including GPU support with Karpenter, by creating a dedicated GPU NodePool during Prepare().
Command used to run the benchmark