Rewrite sigmoid gradient into numerically stable form#2041

Open
ricardoV94 wants to merge 1 commit into pymc-devs:v3 from ricardoV94:expit_grad

Conversation

Member

@ricardoV94 ricardoV94 commented Apr 10, 2026

Replace sigmoid(x) * (1 - sigmoid(x)) with sigmoid(x) * sigmoid(-x) in the Sigmoid pullback. The naive form loses all precision for large positive x: expit(x) rounds to 1, so (1 - expit(x)) becomes exactly zero and the gradient vanishes.
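The cancellation can be shown with a tiny standalone check (plain Python, not PyTensor code; the `sigmoid` helper here is written for illustration):

```python
import math

def sigmoid(x):
    # Numerically safe logistic: avoid overflowing exp() for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

x = 40.0
# sigmoid(40) rounds to exactly 1.0 in float64, so the naive form is 0.0.
naive = sigmoid(x) * (1.0 - sigmoid(x))
# sigmoid(-40) keeps the tiny tail (~4.25e-18), so the stable form survives.
stable = sigmoid(x) * sigmoid(-x)
print(naive, stable)  # 0.0 vs ~4.25e-18
```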

Instead of doing that (which we may still want to), I left the pullback as is but made the stabilize rewrite more aggressive: it now rewrites 1 - sigmoid(x) -> sigmoid(-x) even if sigmoid(x) is used elsewhere. (Users who don't care about this can exclude "stabilize".)

This may be too much tip-toeing. Maybe we want the rewrite to always be eager (and in that case implement the pullback in this form directly). One test checked that the grad of a naive log(1 - sigmoid(x)) simplified (to not contain a sum); that simplification relied on cancelling a sigmoid(x) / sigmoid(x) factor, which an eager stable pullback doesn't produce. (Rewrite ordering is fun.)
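The simplification in question is the identity d/dx log(1 - sigmoid(x)) = -sigmoid(x), which the naive pullback reaches by cancelling (1 - sigmoid(x)) top and bottom. A quick finite-difference check of that identity (standalone Python, not the actual test):

```python
import math

def sigmoid(x):
    # Safe logistic, branching on sign to avoid overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Chain rule: d/dx log(1 - sigmoid(x))
#   = -sigmoid(x) * (1 - sigmoid(x)) / (1 - sigmoid(x)) = -sigmoid(x),
# but only if the (1 - sigmoid(x)) factors cancel symbolically.
f = lambda t: math.log(1.0 - sigmoid(t))
x, h = 0.5, 1e-6
finite_diff = (f(x + h) - f(x - h)) / (2.0 * h)
print(finite_diff, -sigmoid(x))  # both ~ -0.6225
```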

I don't know if sigmoids are expensive enough to worry about duplicating use in the first place.

Enable allow_multiple_clients on the 1-sigmoid(x)->sigmoid(-x) rewrite
so it fires even when sigmoid(x) has other consumers. This stabilizes
expressions like sigmoid(x) * (1 - sigmoid(x)) which suffer catastrophic
cancellation for large |x|.

The sigmoid pullback is kept in naive form to preserve algebraic
cancellation in composed expressions like log(1 - sigmoid(x)).
@ricardoV94 ricardoV94 changed the title Use numerically stable form for sigmoid gradient Rewrite sigmoid gradient into numerically stable form Apr 10, 2026
