Status: Closed
Labels: enhancement (New feature or request)
Summary
Add floating-point math intrinsic words that lower to MLIR math dialect operations (and ultimately to NVVM/libdevice intrinsics on GPU).
Words to implement
Unary operations
| Word | Stack effect | MLIR op | Description |
|---|---|---|---|
| `FEXP` | `( f -- f )` | `math.exp` | Exponential (e^x) |
| `FSQRT` | `( f -- f )` | `math.sqrt` | Square root |
| `FLOG` | `( f -- f )` | `math.log` | Natural logarithm |
| `FABS` | `( f -- f )` | `math.absf` | Absolute value |
| `FNEG` | `( f -- f )` | `arith.negf` | Negation |
Binary operations
| Word | Stack effect | MLIR op | Description |
|---|---|---|---|
| `FMAX` | `( f f -- f )` | `arith.maximumf` | Maximum of two floats |
| `FMIN` | `( f f -- f )` | `arith.minimumf` | Minimum of two floats |
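`arith.maximumf`/`arith.minimumf` use IEEE 754 maximum/minimum semantics, which propagate NaN (unlike, say, Python's built-in `max`, which returns whichever operand compares greater). A minimal Python sketch of the intended `FMAX` behavior, for illustration only:

```python
import math

def fmax(a: float, b: float) -> float:
    # IEEE 754 maximum semantics (as in arith.maximumf):
    # if either operand is NaN, the result is NaN.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a > b else b
```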
Motivation
- `FEXP` + `FMAX`: Required for softmax (`exp(x - max)`) — blocks flash attention, transformer kernels, and any probability-based computation.
- `FSQRT`: Required for score scaling (`1/sqrt(d_k)`) in attention, and for normalization (LayerNorm, RMSNorm).
- `FLOG`: Log-space softmax, cross-entropy loss.
- `FABS`: Numerical stability checks, absolute error computation.
- `FNEG`: Cleaner than `0.0 SWAP F-`; needed for initializing accumulators to `-inf` patterns.
- `FMAX`/`FMIN`: Online reductions (running max/min across tiles), clamping values.
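The `exp(x - max)` trick behind the `FEXP` + `FMAX` requirement can be shown with a small numerically stable softmax sketch (plain Python, purely illustrative — a kernel written against these words would follow the same shape):

```python
import math

def softmax(xs):
    # Subtract the running max (FMAX) before exponentiating (FEXP)
    # so exp() never overflows, even for large logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]
```

Without the max subtraction, `math.exp(1000.0)` overflows; with it, the largest exponent argument is always 0.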
Implementation notes
- All follow the same pattern as existing float ops (`F+`, `F-`, etc.): bitcast i64↔f64 around the math op.
- Unary ops: pop one value, bitcast to f64, apply the math op, bitcast back, push the result.
- Binary ops (`FMAX`/`FMIN`): pop two values, bitcast both to f64, apply the op, bitcast the result back, push.
- MLIR's `math` dialect lowers to LLVM intrinsics, which NVVM maps to libdevice calls (e.g., `__nv_exp`, `__nv_sqrt`).
- `FNEG` uses `arith.negf` rather than the `math` dialect.
- `FMAX`/`FMIN` use `arith.maximumf`/`arith.minimumf` (IEEE 754 semantics: propagate NaN).
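The bitcast i64↔f64 pattern for a unary op can be emulated in plain Python with `struct` (illustrative only — the real lowering emits `arith.bitcast` around the `math` op on i64 stack cells):

```python
import math
import struct

def f64_to_i64(f: float) -> int:
    # Bitcast f64 -> i64: reinterpret the bits, no numeric conversion.
    return struct.unpack("<q", struct.pack("<d", f))[0]

def i64_to_f64(i: int) -> float:
    # Bitcast i64 -> f64: the inverse reinterpretation.
    return struct.unpack("<d", struct.pack("<q", i))[0]

def fsqrt(cell: int) -> int:
    # The described unary lowering: bitcast the popped i64 cell to f64,
    # apply the math op (here sqrt), bitcast back, push the result.
    return f64_to_i64(math.sqrt(i64_to_f64(cell)))
```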
Files to modify
- `include/warpforth/Dialect/Forth/ForthOps.td` — Define new ops
- `lib/Translation/ForthToMLIR/ForthToMLIR.cpp` — Parse words
- `lib/Conversion/ForthToMemRef/ForthToMemRef.cpp` — Add conversion patterns
- `test/Translation/Forth/` — Parser tests
- `test/Conversion/ForthToMemRef/` — Conversion tests
- `test/Pipeline/` — End-to-end pipeline tests
Priority
High — blocks flash attention and most non-trivial GPU compute kernels.
Related
- #10 — Warp-level primitives: shuffle and reductions (needed together for performant reductions)
- #11 — Tensor core / MMA intrinsics (complementary GPU capability)