Vulkan backend produces all-zero outputs on PowerVR GPU (Pixel 10 Pro)

## Summary

The Vulkan backend produces **all-zero outputs** on a **PowerVR D-Series GPU** (Google Pixel 10 Pro). The same model files work correctly on macOS via MoltenVK and on Android via XNNPACK.

## Environment

- **Device**: Google Pixel 10 Pro
- **GPU**: PowerVR D-Series DXT-48-1536 MC1 (`maxImageDimension3D = 2048`)
- **ExecuTorch**: Built from source, branch `fix/vulkan-texture-ubo-budget` (my UBO budget fix from PR #17294, rebased on upstream/main as of Feb 7 2026, commit `ba2516cefa`)
- **NDK**: 28.2.13676358 (Clang 19.0.1)
- **Build**: Release, arm64-v8a, XNNPACK + Vulkan backends

## What I Observe

| Platform | Backend | YOLO (v8) | MobileNet |
|----------|---------|-----------|-----------|
| macOS | XNNPACK | Correct | Correct |
| macOS | Vulkan (MoltenVK) | Correct | Correct |
| Android | XNNPACK | Correct | Correct |
| Android | Vulkan (PowerVR) | All zeros | NaN values |

- **YOLO**: Output tensor shape `[1, 84, 8400]` is correct, but all confidence values are exactly `0.0`
- **MobileNet**: Output contains NaN values
- Both `texture_limits: (2048, 2048, 2048)` and `storage_type_override: BUFFER` produce the same zero results
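For anyone reproducing this, the two failure modes can be told apart with a quick scan of the flattened output. This is a minimal sketch, assuming the output tensor has already been copied into a flat Python sequence of floats; `summarize_output` is a hypothetical helper name, not an ExecuTorch API.

```python
import math

def summarize_output(values):
    """Classify a flattened output tensor as 'has_nan', 'all_zero', or 'ok'.

    Hypothetical helper for triage only; not part of ExecuTorch.
    """
    values = list(values)
    nan_count = sum(1 for v in values if math.isnan(v))
    zero_count = sum(1 for v in values if v == 0.0)  # NaN never compares equal to 0.0
    if nan_count:
        status = "has_nan"
    elif values and zero_count == len(values):
        status = "all_zero"
    else:
        status = "ok"
    return {"status": status, "nan": nan_count, "zero": zero_count, "total": len(values)}
```

Running it on each output separately also shows whether only some outputs are affected.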

## What I Found by Adding Tracing

I added `__android_log_print` traces to `VulkanBackend.cpp`, `ComputeGraph.cpp`, and `StagingBuffer.cpp` to narrow things down. Key findings:

### 1. GPU is PowerVR — no support in ExecuTorch

```
GPU: name='powervr d-series dxt-48-1536 mc1', max3D=2048
```

As far as I can tell, ExecuTorch's Vulkan backend only has device-specific handling for Adreno, Mali, NVIDIA, and SwiftShader. I found no PowerVR-specific handling.

### 2. Output texture extents exceed `maxImageDimension3D`

```
output_tensor[3] extents=(2100,64,1)  max3D=2048 EXCEEDS=1
output_tensor[4] extents=(2100,4,16)  max3D=2048 EXCEEDS=1
```

2100 = 8400 anchors / 4 (texel packing). The detection head outputs also exceed the limit. The extent check only runs in `#ifdef VULKAN_DEBUG` builds (`Tensor.cpp:632-660`), so release builds silently hit undefined behavior.

I exported with `texture_limits: (2048, 2048, 2048)` in VulkanPartitioner, but this only controls which ops get delegated — it doesn't account for texel packing that turns dimension 8400 into texture extent 2100.
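The packing arithmetic can be sketched as follows. This is a simplified model and not ExecuTorch's actual layout code: it assumes shapes are padded to `(N, C, H, W)`, that one dimension is packed four elements per texel, and that batch is folded into the texture depth. `packed_extents` and `exceeds_limit` are hypothetical names.

```python
import math

def packed_extents(sizes, packed_dim):
    """Approximate (x, y, z) texture extents for a tensor shape.

    Assumptions (not taken from ExecuTorch source): the shape is left-padded
    to (N, C, H, W), `packed_dim` is one of "W", "H", "C" and holds 4 elements
    per texel, and batch N is folded into the depth axis.
    """
    n, c, h, w = [1] * (4 - len(sizes)) + list(sizes)
    dims = {"W": w, "H": h, "C": c}
    dims[packed_dim] = math.ceil(dims[packed_dim] / 4)  # 4 elements per texel
    return (dims["W"], dims["H"], dims["C"] * n)

def exceeds_limit(extents, max_dim):
    """True if any extent is larger than the device's maxImageDimension3D."""
    return any(e > max_dim for e in extents)

# Shapes chosen so the result matches the traced extents above.
print(packed_extents([1, 64, 8400], "W"))    # (2100, 64, 1)
print(exceeds_limit((2100, 64, 1), 2048))    # True
```

Under these assumptions, a packing-aware partitioner check would compare the packed extents, not the logical sizes, against `maxImageDimension3D`.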

### 3. Even in-limit tensors are zero

Output tensors 0-2 have extents within the 2048 limit (e.g., `(80,80,16)`, `(40,40,32)`, `(20,20,64)`), but they are also all zeros. This suggests either intermediate tensors also exceed limits, or one bad texture corrupts the entire command buffer state on PowerVR.
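To test the first hypothesis, the same trace could be emitted for intermediate tensors and filtered offline. A minimal sketch, assuming the trace lines keep the `name extents=(x,y,z)  max3D=N` format shown in section 2; `find_oversized` is a hypothetical helper.

```python
import re

# Matches lines like: output_tensor[3] extents=(2100,64,1)  max3D=2048 EXCEEDS=1
TRACE_RE = re.compile(
    r"(?P<name>\S+)\s+extents=\((?P<x>\d+),(?P<y>\d+),(?P<z>\d+)\)\s+max3D=(?P<max>\d+)"
)

def find_oversized(trace_lines):
    """Return (name, extents) for every traced tensor with an extent above max3D."""
    offenders = []
    for line in trace_lines:
        m = TRACE_RE.search(line)
        if m is None:
            continue  # not an extents trace line
        extents = tuple(int(m.group(k)) for k in ("x", "y", "z"))
        if max(extents) > int(m.group("max")):
            offenders.append((m.group("name"), extents))
    return offenders
```

If this reports no intermediate tensors, the second explanation (one bad texture corrupting later work) becomes more likely.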

### 4. Execution mechanics work fine

- 332 nodes encode and submit, fence waits successfully
- Staging buffers have correct memory flags (`HOST_VISIBLE | HOST_COHERENT | DEVICE_LOCAL`)
- Input data is valid (verified non-zero values in staging buffer)
- GPU "completes" work but staging buffers read back all zeros

### 5. Single command buffer didn't help

I tried forcing all dispatches into one command buffer (setting `execute_threshold_node_count` to `UINT32_MAX`). Same result — all zeros.

## Related

- **PR #17294**: my fix for a separate UBO budget crash (`uniform data allocation has exceeded tensor uniform buffer size`). That fix prevents the crash but does not affect the zero-output issue.
- **`cases.py:1464-1470`**: an existing TODO notes Android arm64 failures involving "writes from the first or second shader dispatch being 'ignored'", which matches my symptoms exactly.

## Questions

1. Is PowerVR expected to work at all with the Vulkan backend? Or is it currently untested/unsupported?
2. Could the `texture_limits` partitioner option be made aware of texel packing so it avoids delegating ops whose packed extents exceed `maxImageDimension3D`?
3. Should the texture extent check in `Tensor.cpp:632-660` be enabled in release builds (not just debug)?
4. Any other suggestions for debugging this? Happy to add more traces or test patches.

cc @SS-JIA @manuelcandales @digantdesai @cbilgin
