[ET Device Support] DeviceAllocator interface and DeviceAllocatorRegistry by Gasoonjia · Pull Request #19496 · pytorch/executorch

Gasoonjia · 2026-05-12T04:35:11Z

Stack from ghstack (oldest at bottom):

[ET Device Support] MethodMeta: expose per-buffer device placement API #18474
[ET Device Support] DeviceMemoryBuffer RAII class for device memory lifetime management #18473
[ET Device Support] Emitter reads non_const_buffer_device from graph meta #18472
[ET Device Support] Device-aware memory planning: separate buffers per device type #18375
[ET Device Support] Add NonConstBufferDevice schema for per-buffer device mapping #19497
-> [ET Device Support] DeviceAllocator interface and DeviceAllocatorRegistry #19496

This diff introduces the DeviceAllocator abstract interface and DeviceAllocatorRegistry for device-specific memory allocation. This is a foundational abstraction that lets the runtime dispatch memory operations to the appropriate non-CPU device backend (CUDA, etc.).

**DeviceAllocator interface provides:**

- `allocate()` / `deallocate()` - Dynamic device memory allocation
- `copy_host_to_device()` / `copy_device_to_host()` - Data transfer between host and device
- `device_type()` - Returns the device type this allocator handles

**DeviceAllocatorRegistry provides:**

- Singleton registry mapping DeviceType → DeviceAllocator
- `register_allocator()` / `get_allocator()` methods
- Fixed-size array indexed by device type (no dynamic allocation, embedded-friendly)

**Design notes:**

- Registry stores raw pointers (non-owning) - allocators are expected to be singletons with static lifetime
- Follows ExecuTorch's embedded-first philosophy (no std::unique_ptr, no heap allocation in registry)
- Convenience free functions `register_device_allocator()` and `get_device_allocator()` for ease of use
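
To make the shape of this abstraction concrete, here is a minimal standalone sketch of an interface and registry along the lines described above. It is not the code from this PR: the `alloc_sketch` namespace, the `DeviceType` values, every parameter list and return type, and the `HostBackedAllocator` stand-in are assumptions made for illustration (the actual ExecuTorch signatures are not shown in this description and likely use the runtime's own error types).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

namespace alloc_sketch {  // hypothetical namespace, not the ExecuTorch one

// Assumed device enumeration; kCount only sizes the registry array.
enum class DeviceType : std::uint8_t { CPU = 0, CUDA = 1, kCount = 2 };

// Abstract interface with the five operations named in the description.
// Parameters and return types are assumptions.
class DeviceAllocator {
 public:
  virtual ~DeviceAllocator() = default;
  virtual void* allocate(std::size_t nbytes) = 0;
  virtual void deallocate(void* ptr) = 0;
  virtual bool copy_host_to_device(void* dst, const void* src, std::size_t nbytes) = 0;
  virtual bool copy_device_to_host(void* dst, const void* src, std::size_t nbytes) = 0;
  virtual DeviceType device_type() const = 0;
};

// Singleton registry: a fixed-size array of non-owning raw pointers,
// indexed by device type. The registry itself never allocates.
class DeviceAllocatorRegistry {
 public:
  static DeviceAllocatorRegistry& instance() {
    static DeviceAllocatorRegistry registry;
    return registry;
  }

  // Returns false for a null allocator, an out-of-range type, or a taken slot.
  bool register_allocator(DeviceType type, DeviceAllocator* alloc) {
    const auto idx = static_cast<std::size_t>(type);
    if (alloc == nullptr || idx >= allocators_.size() || allocators_[idx] != nullptr) {
      return false;
    }
    allocators_[idx] = alloc;
    return true;
  }

  // Returns nullptr when nothing is registered for the device type.
  DeviceAllocator* get_allocator(DeviceType type) const {
    const auto idx = static_cast<std::size_t>(type);
    return idx < allocators_.size() ? allocators_[idx] : nullptr;
  }

 private:
  DeviceAllocatorRegistry() = default;
  std::array<DeviceAllocator*, static_cast<std::size_t>(DeviceType::kCount)> allocators_{};
};

// Convenience free functions that forward to the singleton.
inline bool register_device_allocator(DeviceType type, DeviceAllocator* alloc) {
  return DeviceAllocatorRegistry::instance().register_allocator(type, alloc);
}
inline DeviceAllocator* get_device_allocator(DeviceType type) {
  return DeviceAllocatorRegistry::instance().get_allocator(type);
}

// Stand-in "device" backed by host memory so the sketch runs without a GPU.
class HostBackedAllocator final : public DeviceAllocator {
 public:
  void* allocate(std::size_t nbytes) override { return std::malloc(nbytes); }
  void deallocate(void* ptr) override { std::free(ptr); }
  bool copy_host_to_device(void* dst, const void* src, std::size_t nbytes) override {
    std::memcpy(dst, src, nbytes);
    return true;
  }
  bool copy_device_to_host(void* dst, const void* src, std::size_t nbytes) override {
    std::memcpy(dst, src, nbytes);
    return true;
  }
  DeviceType device_type() const override { return DeviceType::CUDA; }
};

// Usage: register once, look up by device type, round-trip a small buffer.
inline bool round_trip_demo() {
  static HostBackedAllocator fake;  // static lifetime; the registry does not own it
  register_device_allocator(DeviceType::CUDA, &fake);  // returns false if already registered
  DeviceAllocator* alloc = get_device_allocator(DeviceType::CUDA);
  if (alloc == nullptr) return false;
  const char msg[] = "hello";
  char out[sizeof(msg)] = {};
  void* dev = alloc->allocate(sizeof(msg));
  if (dev == nullptr) return false;
  const bool ok = alloc->copy_host_to_device(dev, msg, sizeof(msg)) &&
                  alloc->copy_device_to_host(out, dev, sizeof(msg));
  alloc->deallocate(dev);
  return ok && std::memcmp(msg, out, sizeof(msg)) == 0;
}

}  // namespace alloc_sketch
```

In this sketch the allocator is a function-local static, which matches the design note that the registry holds non-owning pointers to objects with static lifetime. Rejecting a second registration for the same device type is a choice made here, not something the description specifies.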

Differential Revision: D93635656


pytorch-bot · 2026-05-12T04:35:15Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19496

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Run pull jobs on OSDC in pull requests shadow mode

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-05-12T04:35:59Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@Gasoonjia

…stry (#19498)

This PR was created by the merge bot to help merge the original PR into the main branch.

ghstack PR number: #19496 by @Gasoonjia
^ Please use this as the source of truth for the PR details, comments, and reviews

ghstack PR base: https://github.com/pytorch/executorch/tree/gh/gasoonjia/167/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/167/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/167/orig

Differential Revision: [D93635656](https://our.internmc.facebook.com/intern/diff/D93635656/)
@diff-train-skip-merge

Co-authored-by: gasoonjia <gasoonjia@icloud.com>

…vice mapping (#19497)

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* #18474
* #18473
* #18472
* #18375
* __->__ #19497
* #19496

Adds the NonConstBufferDevice table to the FlatBuffer schema (program.fbs) and the corresponding Python dataclass to schema.py. This enables mapping each non-constant planned memory buffer to a specific device type (CPU, CUDA, etc.). The field is optional and absent for CPU-only programs, ensuring zero binary size regression.

Differential Revision: [D97335597](https://our.internmc.facebook.com/intern/diff/D97335597/)

…r device type (#18375)

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* #18474
* #18473
* #18472
* __->__ #18375
* #19497
* #19496

Extends memory planning to separate device tensors from CPU tensors into distinct memory buffers. Non-CPU TensorSpecs (e.g., CUDA) are pre-assigned device-specific mem_ids before the greedy/naive algorithm runs, ensuring they get planned into independent memory buffers that never share space with CPU tensors.

Differential Revision: [D97447105](https://our.internmc.facebook.com/intern/diff/D97447105/)
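
The pre-assignment step described in this commit message can be illustrated with a small sketch. It is written in C++ for consistency with the runtime side of this stack and is not the actual memory planning pass from #18375. The `planning_sketch` namespace, the simplified `TensorSpec` struct, the mem_id numbering in `mem_id_for()`, and the bump-offset planner are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

namespace planning_sketch {  // hypothetical namespace

enum class DeviceType : std::uint8_t { CPU = 0, CUDA = 1 };

// Simplified stand-in for a tensor spec; only the fields the sketch needs.
struct TensorSpec {
  DeviceType device;
  std::size_t nbytes;
  int mem_id = -1;         // -1 means "not yet assigned"
  std::size_t offset = 0;  // byte offset inside the buffer named by mem_id
};

// Assumed numbering: CPU tensors use mem_id 1, each non-CPU device gets its own id.
constexpr int kCpuMemId = 1;
inline int mem_id_for(DeviceType device) {
  return device == DeviceType::CPU ? kCpuMemId : kCpuMemId + static_cast<int>(device);
}

// Step 1: give every non-CPU spec a device-specific mem_id before planning.
inline void preassign_device_mem_ids(std::vector<TensorSpec>& specs) {
  for (auto& spec : specs) {
    if (spec.device != DeviceType::CPU) {
      spec.mem_id = mem_id_for(spec.device);
    }
  }
}

// Step 2: a naive planner that honors pre-assigned mem_ids and lays tensors
// out back to back within each buffer. Returns the size of each buffer.
inline std::map<int, std::size_t> naive_plan(std::vector<TensorSpec>& specs) {
  std::map<int, std::size_t> buffer_sizes;
  for (auto& spec : specs) {
    if (spec.mem_id < 0) spec.mem_id = kCpuMemId;  // unassigned specs go to the CPU buffer
    std::size_t& used = buffer_sizes[spec.mem_id];
    spec.offset = used;
    used += spec.nbytes;
  }
  return buffer_sizes;
}

}  // namespace planning_sketch
```

Because the device specs already carry their own mem_id when the planner runs, the planner never places a CUDA tensor and a CPU tensor in the same buffer, which is the property the commit message describes.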

Gasoonjia requested review from JacobSzwejbka and lucylq as code owners May 12, 2026 04:35

meta-cla Bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) May 12, 2026

Gasoonjia merged commit 664abf8 into gh/gasoonjia/167/base May 12, 2026
163 of 166 checks passed

Gasoonjia deleted the gh/gasoonjia/167/head branch May 12, 2026 04:41

Gasoonjia temporarily deployed to cherry-pick-bot May 12, 2026 04:41 — with GitHub Actions Inactive

pytorchbot mentioned this pull request May 12, 2026

[ET Device Support] DeviceAllocator interface and DeviceAllocatorRegistry #19498

Merged
