Python/PyCUDA integration for launching WarpForth kernels #45

@tetsuo-cpp

Description

Summary

Create a Python module that loads WarpForth-compiled PTX kernels and launches them with NumPy/PyTorch tensors as arguments. This replaces warpforth-runner for real workloads.

Motivation

The existing warpforth-runner is a standalone C++ tool designed for testing; it takes CSV values on the command line. Real ML workloads need to pass large tensors (millions of elements) directly from Python without serialization overhead.

Design

Core API

from warpforth import WarpForthKernel

# Compile and load
kernel = WarpForthKernel("attention.forth")

# Launch with PyTorch tensors (zero-copy via .data_ptr())
kernel.launch(
    Q_gpu, K_gpu, V_gpu, O_gpu,   # GPU tensors
    seq_len, head_dim,             # scalar params
    grid=(seq_len, 1, 1),
    block=(64, 1, 1),
)

Implementation

  • Use PyCUDA's cuda.module_from_buffer() to load PTX
  • Accept both NumPy arrays (copy to GPU) and PyTorch CUDA tensors (zero-copy via data_ptr())
  • Subprocess call to warpforthc for compilation, or accept pre-compiled PTX
  • Parse \! header directives from Forth source to determine parameter types and order
  • Map f64 arrays to float64 device pointers, i64 arrays to int64 device pointers
  • Handle scalar params (pass by value, not pointer)
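
The directive-parsing step above can be sketched in pure Python. The exact `\! param NAME TYPE` grammar is an assumption inferred from the parameter-mapping table below (types `f64`, `i64`, `f64[N]`, `i64[N]`); adjust the regex once the real WarpForth directive syntax is pinned down.

```python
import re
from dataclasses import dataclass

@dataclass
class Param:
    name: str
    base_type: str  # "f64" or "i64"
    is_array: bool  # True for f64[N] / i64[N]

# Matches lines like:  \! param Q f64[N]   or   \! param seq_len i64
# (syntax assumed from the parameter-mapping table, not confirmed)
_PARAM_RE = re.compile(r"^\\!\s+param\s+(\w+)\s+(f64|i64)(\[\w+\])?\s*$")

def parse_params(forth_source: str) -> list:
    """Extract parameter declarations, in order, from \\! header lines."""
    params = []
    for line in forth_source.splitlines():
        m = _PARAM_RE.match(line.strip())
        if m:
            params.append(Param(m.group(1), m.group(2), m.group(3) is not None))
    return params
```

Parsing in declaration order matters: the launch wrapper must pass arguments to the PTX kernel in exactly the order the header declares them.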

Parameter mapping

Forth declaration    Python input                    CUDA argument
\! param X f64[N]    torch.Tensor (float64, CUDA)    Device pointer
\! param X i64[N]    torch.Tensor (int64, CUDA)      Device pointer
\! param X f64       float                           Value (bitcast to i64)
\! param X i64       int                             Value
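
A hedged sketch of the per-argument conversion this table implies. `to_kernel_arg` is an illustrative name, not an existing API; it assumes PyCUDA accepts raw integer device pointers (such as `torch.Tensor.data_ptr()`) as kernel arguments, and implements the scalar f64-to-i64 bitcast with `struct`. The NumPy auto-copy path (`cuda.mem_alloc`/`cuda.memcpy_htod`) is omitted because it needs a live CUDA context.

```python
import struct
import numpy as np

def to_kernel_arg(value, base_type: str, is_array: bool):
    """Map one Python input to a launch argument per the table above."""
    if is_array:
        # PyTorch CUDA tensor: pass the raw device pointer (zero-copy).
        if hasattr(value, "data_ptr"):
            return np.uintp(value.data_ptr())
        raise TypeError("array params expect a CUDA tensor "
                        "(NumPy inputs would be copied to the GPU first)")
    if base_type == "f64":
        # Scalar f64: pass by value, bitcast to i64 as the table specifies.
        (bits,) = struct.unpack("<q", struct.pack("<d", float(value)))
        return np.int64(bits)
    return np.int64(value)  # scalar i64
```

The bitcast round-trips exactly: reinterpreting the returned i64 bits as a little-endian double recovers the original float.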

Files to create

  • demo/warpforth.py — The integration module
  • demo/requirements.txt or pyproject.toml — Dependencies (pycuda, numpy, torch)

Acceptance criteria

  • Can load a WarpForth-compiled PTX kernel
  • Can launch with PyTorch CUDA tensors (zero-copy)
  • Can launch with NumPy arrays (auto-copy to GPU)
  • Correctly handles both array and scalar parameters
  • Works with the attention kernel from #44 (Naive attention kernel in Forth)

Labels: enhancement (New feature or request)
