Prepare v0.4.0 release #172

LaurenzV · 2025-12-16T08:13:16Z

I think now is a good time to get this out. I double-checked the changelog and only found one other PR worth mentioning.

valadaptive · 2025-12-16T15:30:17Z

Since #159 and #170 are both breaking changes that touch the core of the library, I would've liked to get them in before cutting a release. I suppose it doesn't matter on the technical side of things, and maybe it's better to just land those later and cut another release immediately afterwards. However, if there's a minimum wait time between releases, then I'd like to get both of those in first.

I think fearless_simd is fairly close to dropping the "experimental" warning as well; I believe implementations for all supported architectures are completed now. We should figure out which release we want to officially drop the warning for.

LaurenzV · 2025-12-16T15:45:06Z

I wouldn’t mind cutting a v0.5 right after those two PRs are merged. I just think it would be good to have a “checkpoint” up until now: as we’ve seen, #159 is a bigger change that seems to have performance impacts in certain cases, so it would be good to have this release to fall back on in case we notice other problems. (I’m sure it will be fine, but I don’t think it hurts either.)

valadaptive · 2025-12-16T16:27:15Z

> I just think it would be good to have a “checkpoint” up until now because as we’ve seen #159 is a bigger change that seems to have performance impacts in certain cases, so it would be good to have this release to fall back on in case we notice other problems (I’m sure it will be fine! but I don’t think it hurts either.)

The performance impacts of #159 are part of the reason I want to get it in before v0.4. Say the v0.4 release announcement gets posted to Reddit or Hacker News, people go "neat, looks like fearless_simd is actually usable now" and start using it in their libraries and optimizing around it, and then we release v0.5 and everything needs to be re-tuned.

IMO, vello_cpu is probably experiencing weird performance changes because it's already been heavily tuned on the autovectorization-based implementation. I would rather figure out the real causes of the performance regressions. The ones we've tracked down so far are things we probably shouldn't have done in the first place: forgetting to call vectorize, deinterleaving data then immediately re-interleaving it, converting an array to a vector then immediately converting it back to an array.
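To make the "deinterleave then immediately re-interleave" pattern concrete, here is a minimal sketch in plain scalar Rust. It does not use fearless_simd at all; the function names and the two-channel layout are made up for illustration. Both functions compute the same result, but the first does extra shuffling that a per-element operation never needed.

```rust
/// Round-trip version: splits interleaved data `[a0, b0, a1, b1, ...]` into
/// two planes, scales every element, then interleaves the planes again.
/// Assumes `data.len()` is even (a trailing odd element would be dropped).
pub fn scale_round_trip(data: &[f32], k: f32) -> Vec<f32> {
    let mut a = Vec::with_capacity(data.len() / 2);
    let mut b = Vec::with_capacity(data.len() / 2);
    // Deinterleave into two separate planes.
    for pair in data.chunks_exact(2) {
        a.push(pair[0]);
        b.push(pair[1]);
    }
    // Scale and immediately re-interleave.
    let mut out = Vec::with_capacity(data.len());
    for (x, y) in a.iter().zip(b.iter()) {
        out.push(x * k);
        out.push(y * k);
    }
    out
}

/// Direct version: the scale is per-element, so the interleaved layout can be
/// processed as-is with no shuffling.
pub fn scale_direct(data: &[f32], k: f32) -> Vec<f32> {
    data.iter().map(|x| x * k).collect()
}
```

In SIMD code the same redundancy shows up as shuffle instructions on either side of the arithmetic, which is the kind of thing worth removing before blaming the backend.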

FWIW, I had to rearrange some code when porting my own project to fearless_simd as well, despite already using native vector types and performing the exact same operations that mapped to the exact same instructions. LLVM just decided to schedule the instructions differently and make things 10% slower. Unfortunately, I think the optimizer is just inherently fickle.

Maybe we could just re-add Level::fallback as a deprecated alias of Level::baseline, and madd/msub as deprecated aliases of mul_add/mul_sub, to make it easier to revert to v0.3 if need be?
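Roughly what I have in mind, as a sketch only. The `Level` enum, the `Scalar` wrapper, and the `FloatOps` trait below are simplified stand-ins I made up for this comment, not the actual fearless_simd definitions, and the `self * a + b` argument order is an assumption. The only real mechanism is Rust's `#[deprecated]` attribute.

```rust
// Hypothetical stand-in for the library's level type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Baseline,
}

impl Level {
    /// The new name.
    pub const fn baseline() -> Self {
        Level::Baseline
    }

    /// Old name kept as a thin alias so downstream code keeps compiling,
    /// with a warning that points at the replacement.
    #[deprecated(since = "0.4.0", note = "use `Level::baseline` instead")]
    pub const fn fallback() -> Self {
        Self::baseline()
    }
}

// Hypothetical one-lane "vector" so the example stays self-contained.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Scalar(pub f32);

// Hypothetical trait standing in for the float operations.
pub trait FloatOps: Sized {
    /// Assumed semantics for this sketch: `self * a + b`.
    fn mul_add(self, a: Self, b: Self) -> Self;
    /// Assumed semantics for this sketch: `self * a - b`.
    fn mul_sub(self, a: Self, b: Self) -> Self;

    /// Deprecated alias, provided as a default method so implementors
    /// only need to supply the new names.
    #[deprecated(since = "0.4.0", note = "use `mul_add` instead")]
    fn madd(self, a: Self, b: Self) -> Self {
        self.mul_add(a, b)
    }

    /// Deprecated alias for `mul_sub`.
    #[deprecated(since = "0.4.0", note = "use `mul_sub` instead")]
    fn msub(self, a: Self, b: Self) -> Self {
        self.mul_sub(a, b)
    }
}

impl FloatOps for Scalar {
    fn mul_add(self, a: Self, b: Self) -> Self {
        Scalar(self.0 * a.0 + b.0)
    }

    fn mul_sub(self, a: Self, b: Self) -> Self {
        Scalar(self.0 * a.0 - b.0)
    }
}
```

Callers using the old names would get a deprecation warning rather than a compile error, so switching between v0.3 and v0.4 would not require touching every call site.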

Prepare v0.4.0 release

d895215

LaurenzV requested review from DJMcNab and valadaptive December 16, 2025 08:13

LaurenzV marked this pull request as draft December 16, 2025 18:28
