Add GPU fallback support via unified compute field #65

Open
Hkhan161 wants to merge 1 commit into main from harris/gpu-fallbacks
@Hkhan161 Hkhan161 commented Apr 12, 2026

Summary

compute now accepts either a string or array in the TOML:

compute = "HOPPER_H100"                                       # single GPU
compute = ["HOPPER_H100", "HOPPER_H200", "AMPERE_A100_80GB"]  # with fallbacks

The first element is the primary GPU; the rest are fallbacks, in the order listed. The CLI normalizes both forms and sends compute to the backend API as an array when fallbacks are present, or as a string otherwise. No separate compute_fallbacks field is exposed to users.
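
To make the normalization above concrete, here is a minimal sketch of how a string-or-array value could be split into a primary GPU and fallbacks. It is not the code from this PR: the function signature, the error messages, and the assumption that the TOML decoder yields a `string` or `[]interface{}` for an `interface{}` target are all illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// normalizeCompute is a hypothetical stand-in for the PR's normalizeCompute().
// It accepts the raw decoded value of `compute` and returns the primary GPU
// plus any fallbacks. A nil input means the field was omitted.
func normalizeCompute(raw interface{}) (*string, []string, error) {
	switch v := raw.(type) {
	case nil:
		// Field omitted: leave the default untouched.
		return nil, nil, nil
	case string:
		// Single-GPU form: compute = "HOPPER_H100"
		return &v, nil, nil
	case []interface{}:
		// Array form: first element is primary, the rest are fallbacks.
		if len(v) == 0 {
			return nil, nil, errors.New("compute array must not be empty")
		}
		names := make([]string, 0, len(v))
		for i, item := range v {
			s, ok := item.(string)
			if !ok {
				return nil, nil, fmt.Errorf("compute[%d] must be a string, got %T", i, item)
			}
			names = append(names, s)
		}
		return &names[0], names[1:], nil
	default:
		return nil, nil, fmt.Errorf("compute must be a string or an array of strings, got %T", raw)
	}
}

func main() {
	raw := []interface{}{"HOPPER_H100", "HOPPER_H200", "AMPERE_A100_80GB"}
	primary, fallbacks, err := normalizeCompute(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("primary:", *primary)
	fmt.Println("fallbacks:", fallbacks)
}
```

Keeping the primary as a `*string` matches the description's point that existing callers of Compute do not need to change.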

Changes (3 files)

  • config.go: ComputeRaw interface{} for TOML input, Compute *string unchanged, added ComputeFallbacks []string. Payload sends array when fallbacks present.
  • loader.go: normalizeCompute() splits string/array into Compute + ComputeFallbacks
  • validator.go: Validates primary and all fallbacks against the compute enum
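
The validation step listed above might look roughly like the following sketch. The `knownCompute` set is an assumption that contains only the three GPU names that appear in this PR, not the real enum, and `validateCompute` is a hypothetical name rather than the function in validator.go.

```go
package main

import "fmt"

// knownCompute is an illustrative subset of the compute enum; only these
// three names appear in the PR description. The real enum is not shown here.
var knownCompute = map[string]bool{
	"HOPPER_H100":      true,
	"HOPPER_H200":      true,
	"AMPERE_A100_80GB": true,
}

// validateCompute checks the primary GPU and every fallback against the enum.
// Hypothetical helper; the error wording is an assumption.
func validateCompute(primary *string, fallbacks []string) error {
	if primary == nil {
		if len(fallbacks) > 0 {
			return fmt.Errorf("compute fallbacks given without a primary compute")
		}
		return nil // nothing to validate; defaults apply
	}
	if !knownCompute[*primary] {
		return fmt.Errorf("invalid compute %q", *primary)
	}
	for i, fb := range fallbacks {
		if !knownCompute[fb] {
			return fmt.Errorf("invalid compute fallback %q at position %d", fb, i+1)
		}
	}
	return nil
}

func main() {
	primary := "HOPPER_H100"
	fmt.Println(validateCompute(&primary, []string{"HOPPER_H200"}))
	fmt.Println(validateCompute(&primary, []string{"NOT_A_GPU"}))
}
```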

No changes to

  • deploy.go, run.go, deploy_test.go: Compute stays *string, all existing code untouched

Backward compatible

  • compute = "H100" (string) works exactly as before
  • Omitting fallbacks changes nothing
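
As a sketch of why the string form stays backward compatible, the payload value can remain a plain string unless fallbacks are present. The helper name `computePayload` and the `"compute"` JSON key are assumptions based on the description, not the actual request schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// computePayload returns the value to send for compute: nil when unset,
// a plain string when there are no fallbacks, and an array (primary first)
// when fallbacks are present. Hypothetical helper for illustration only.
func computePayload(primary *string, fallbacks []string) interface{} {
	if primary == nil {
		return nil
	}
	if len(fallbacks) == 0 {
		return *primary
	}
	return append([]string{*primary}, fallbacks...)
}

func main() {
	primary := "HOPPER_H100"
	for _, fallbacks := range [][]string{nil, {"HOPPER_H200"}} {
		// The "compute" key is assumed; the real payload has other fields.
		body, err := json.Marshal(map[string]interface{}{
			"compute": computePayload(&primary, fallbacks),
		})
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(string(body))
	}
}
```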

Companion backend PR: CerebriumAI/dashboard-backend#3432

Test plan

  • All tests pass (go test ./...)
  • Deployed to dev with array syntax, verified ksvc manifest
  • Deployed to dev with string syntax, verified backward compat
  • Deployed to dev with no tier/no fallbacks, verified default behavior

@elijah-rou elijah-rou left a comment

Yeah same comment as backend, another param seems bloaty here

@Hkhan161 Hkhan161 force-pushed the harris/gpu-fallbacks branch 6 times, most recently from 888b060 to 2a33baf April 15, 2026 01:02
@Hkhan161 Hkhan161 changed the title from "Add compute_fallbacks support for GPU fallback scheduling" to "Add GPU fallback support via unified compute field" Apr 15, 2026
compute now accepts either a string or array in the TOML:

  compute = "HOPPER_H100"
  compute = ["HOPPER_H100", "HOPPER_H200", "AMPERE_A100_80GB"]

First element is primary, rest are fallbacks. CLI normalizes both
forms and sends compute as array to the backend API.

Changes:
- config.go: ComputeRaw (interface{}) for TOML, Compute (*string) unchanged
- loader.go: normalizeCompute() splits string/array into Compute + ComputeFallbacks
- validator.go: validates primary and fallbacks against compute enum

No changes to deploy.go, run.go, or tests — Compute stays *string.

Companion backend PR: CerebriumAI/dashboard-backend#3432

Made-with: Cursor
@Hkhan161 Hkhan161 force-pushed the harris/gpu-fallbacks branch from 2a33baf to b0155a0 April 15, 2026 01:08