[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern by swjng · Pull Request #19568 · apache/tvm

swjng · 2026-05-15T07:37:35Z

Summary

Six LLVM legalize rules in src/target/llvm/intrin_rule_llvm.cc use inline mathematical identities that fail on representable inputs because the intermediate computation overflows or cancels, even though the true result is in float32 range:

Op	Inline form	Failure	True result
`sinh`/`cosh` (#19559)	`(exp(x) ± exp(-x)) / 2`	`exp(89) > FLT_MAX`, intermediate is `inf`	`sinh(89) ≈ 2.24e38`
`atan` (#19560)	`asin(x / sqrt(x²+1))`	`x²` overflows for `	x
`asinh` (#19561)	`log(x + sqrt(x²+1))`	same `x²` overflow → `log(inf)=inf`	`asinh(3e22) ≈ 52.45`
`erf` (#19562)	A&S `1 − poly(t)·exp(−x²)`	`poly·exp(−x²) ≈ 1` for tiny `	x
`acosh` (no issue)	`log(x + sqrt(x²−1))`	same `x²` overflow → `inf`	`acosh(3e22) ≈ 52.45`

acosh was not in the original issue cluster but shows the identical bug pattern to asinh; folding it in keeps this PR's scope consistent ("naive math identity → libm extern"). Happy to split it out if reviewers prefer.

Fix

Route all six through the existing DispatchPureExtern<FloatSuffix> helper — i.e. sinhf, coshf, atanf, asinhf, acoshf, erff — the same pattern asin/acos use after #19567. ULP-grade accuracy across the reported ranges.

sinh(89.0):    ORT=2.244806e+38  TVM=2.244806e+38  (was inf)
atan(3e22):    ORT=1.5707964     TVM=1.5707963     (was 0.0)
asinh(3e22):   ORT=52.44863      TVM=52.44863      (was inf)
acosh(3e22):   ORT=52.44863      TVM=52.44863      (was inf)
erf(3e-12):    ORT=3.385e-12     TVM=3.385e-12     (was 0.0)

Atan is re-enabled in test_unary; the overflow that previously broke it is fixed.

Notes for reviewers

Inline-vs-extern decision. If the inline identities were a deliberate fast-path (e.g. for autovectorization or to avoid extern-call overhead in tight loops), please flag it and I'll switch to stable inline forms instead — exp(x − ln 2) ± exp(−x − ln 2) for sinh/cosh, range-reduced asinh/acosh sign(x)·log(2|x|) for large |x|, small-|x| Taylor branch for erf, etc. I could not find evidence of such intent in the git history (sinh/cosh: original commit; atan/asinh/acosh: #17945 / #17969 follow-ups; erf: #18104 was framed as "more precise than tanh-approx", not "fast inline").

Fixes #19559.
Fixes #19560.
Fixes #19561.
Fixes #19562.

gemini-code-assist

Code Review

This pull request refactors the LLVM legalization rules for several intrinsic operations, including cosh, sinh, atan, asinh, and erf, by replacing manual mathematical expansions with a unified DispatchPureExtern call. Additionally, the Atan operator has been enabled in the ONNX frontend tests. The reviewer pointed out that the using namespace intrin; statements, local call variable assignments, and manual null checks are redundant in the new implementation because DispatchPureExtern is fully qualified and performs its own internal validation.

ConvolutedDog · 2026-05-18T02:38:09Z

-      PrimExpr exp_posx = exp(x);
-      PrimExpr ret = (exp_posx + exp_negx) / two;
-      return ret;
+      return ::tvm::codegen::intrin::DispatchPureExtern<::tvm::codegen::intrin::FloatSuffix>(e);


I have a question: default.FLowerIntrinsic already handles conversion for these operations, wouldn't simply deleting the llvm.FLegalize registrations achieve the same result as adding redundant DispatchPureExtern rules in intrin_rule_llvm.cc? Are there any other considerations?

tvm/src/target/intrin_rule.cc

Lines 78 to 79 in 4b93f20

TVM_REGISTER_OP("tirx.cosh")

.set_attr<FLowerIntrinsic>("default.FLowerIntrinsic", DispatchPureExtern<FloatSuffix>);

You're right, I didn't check the fallback.

…tern Five LLVM legalize rules used inline mathematical identities that fail on representable inputs because the intermediate computation overflows or cancels, even though the true result is in range: - `sinh`/`cosh`: `(exp(x) ± exp(-x)) / 2` — `exp(89) > FLT_MAX` so `exp(x)` itself overflows. True `sinh(89) ≈ 2.24e38` fits in float32. (apache#19559) - `atan`: `asin(x / sqrt(x*x + 1))` — `x*x` overflows for `|x| > sqrt(FLT_MAX) ≈ 1.84e19`, then `sqrt(inf) = inf`, `x / inf = 0`, `asin(0) = 0`. (apache#19560) - `asinh`: `log(x + sqrt(x*x + 1))` — same `x*x` overflow. True `asinh(3e22) ≈ 52.45`. (apache#19561) - `erf`: Abramowitz–Stegun `1 - poly(t) * exp(-x*x)` — for small `|x|`, `poly * exp(-x*x) ≈ 1` and the subtraction cancels to 0 in float32, flushing `erf(3e-12)` to 0 instead of the true `~3.4e-12`. (apache#19562) All four route through the existing `DispatchPureExtern<FloatSuffix>` helper (i.e. `sinhf`, `coshf`, `atanf`, `asinhf`, `erff`), the same pattern already used by `asin`/`acos`. ULP-grade accuracy across representative ranges; `Atan` is re-enabled in `test_unary` since the overflow that previously broke it is fixed. Note for reviewers: if the inline identities were a deliberate fast-path (e.g. for autovectorization or to avoid extern call overhead in tight loops), please flag it and I'll switch to stable inline forms (`exp(x − ln 2) ± exp(−x − ln 2)` for sinh/cosh, range-reduced asinh, small-x Taylor for erf, etc.). I could not find evidence of such intent in the git history. Acosh shows the same `sqrt(x*x − 1)` overflow pattern but is not covered by any of the listed issues; happy to include it as a follow-up if maintainers want. Fixes apache#19559. Fixes apache#19560. Fixes apache#19561. Fixes apache#19562.

`acosh` has the same `sqrt(x*x - 1)` overflow pattern as `asinh`: intermediate `x*x` overflows float32 for `|x| > sqrt(FLT_MAX) ≈ 1.84e19`, so `sqrt(inf) = inf`, `log(x + inf) = inf`, while the true result `acosh(3e22) ≈ 52.45` is well within range. No issue was filed for this op but the bug is identical to apache#19561 and the fix is the same.

…rappers Per review feedback: with the inline math identities gone, the `DispatchPureExtern<FloatSuffix>` call is fully qualified (`::tvm::codegen::intrin::...`) and the `using namespace intrin;` line inside each lambda no longer brings anything into scope. Drop it from the six ops touched in this PR (sinh, cosh, atan, asinh, acosh, erf). The `CallNode* call` ICHECK is kept for parity with the rest of the file (every legalize lambda in this translation unit performs that check).

Delete llvm.FLegalize registrations for sinh, cosh, atan, asinh, acosh, and erf that simply forwarded to DispatchPureExtern. The default.FLowerIntrinsic fallback in intrin_rule.cc already provides identical behavior, so these entries were redundant.

gemini-code-assist Bot reviewed May 15, 2026

View reviewed changes

ConvolutedDog reviewed May 18, 2026

View reviewed changes

swjng added 4 commits May 18, 2026 16:32

swjng force-pushed the fix/llvm-math-overflow branch from c3bfcea to aba2270 Compare May 18, 2026 07:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern#19568

[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern#19568
swjng wants to merge 4 commits into
apache:mainfrom
swjng:fix/llvm-math-overflow

swjng commented May 15, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ConvolutedDog May 18, 2026

Uh oh!

swjng May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	TVM_REGISTER_OP("tirx.cosh")
	.set_attr<FLowerIntrinsic>("default.FLowerIntrinsic", DispatchPureExtern<FloatSuffix>);

Conversation

swjng commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Notes for reviewers

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ConvolutedDog May 18, 2026

Choose a reason for hiding this comment

Uh oh!

swjng May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

swjng commented May 15, 2026 •

edited

Loading