[WIP] ByteString Two-Way string matching in indexOfSlice #2324

Draft
pjfanning wants to merge 8 commits into apache:main from pjfanning:bs-2waysearch

@pjfanning
Member

@pjfanning pjfanning marked this pull request as draft October 18, 2025 16:10
@pjfanning pjfanning changed the title ByteString Two-Way string matching in indexOfSlice [WIP] ByteString Two-Way string matching in indexOfSlice Oct 18, 2025
… array segments match

format

fix issue

more fully copy Netty Two-Way string matching impl

scalafmt

format

Update SWARUtil.scala

Update ByteString.scala
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a Netty-derived two-way string matching algorithm for ByteString.indexOfSlice (and a shared byte-segment matcher), aiming to evaluate potential performance improvements versus existing implementations.

Changes:

  • Add Netty-derived two-way substring search implementation to ByteString1 / ByteString1C indexOfSlice.
  • Introduce optimized byte-segment matching helpers (bytesMatch / arrayBytesMatch) using getLong comparisons.
  • Update licensing files to reflect Netty-derived code usage in ByteString.scala.
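
For readers unfamiliar with the technique, here is a minimal sketch of Two-Way (Crochemore-Perrin) substring search over plain byte arrays. It follows the textbook formulation, not the Netty-derived code in this PR; the object name `TwoWaySketch` and its method names are hypothetical, and it searches `Array[Byte]` rather than `ByteString`.

```scala
object TwoWaySketch {
  // Maximal suffix of x under the byte order (or its reverse).
  // Returns (index just before the suffix start, period of that suffix).
  private def maxSuffix(x: Array[Byte], reversed: Boolean): (Int, Int) = {
    val m = x.length
    var ms = -1
    var j = 0
    var k = 1
    var p = 1
    while (j + k < m) {
      val a = x(j + k)
      val b = x(ms + k)
      val less = if (reversed) a > b else a < b
      if (less) { j += k; k = 1; p = j - ms }
      else if (a == b) { if (k != p) k += 1 else { j += p; k = 1 } }
      else { ms = j; j = ms + 1; k = 1; p = 1 }
    }
    (ms, p)
  }

  /** Index of the first occurrence of needle in haystack, or -1 if absent. */
  def indexOf(haystack: Array[Byte], needle: Array[Byte]): Int = {
    val n = haystack.length
    val m = needle.length
    if (m == 0) return 0
    if (m > n) return -1
    // Critical factorization: take the later of the two maximal suffixes.
    val (i1, p1) = maxSuffix(needle, reversed = false)
    val (i2, p2) = maxSuffix(needle, reversed = true)
    val (ell, period) = if (i1 > i2) (i1, p1) else (i2, p2)
    // The needle is periodic if its left part repeats `period` bytes later.
    var periodic = true
    var c = 0
    while (periodic && c <= ell) {
      if (needle(c) != needle(c + period)) periodic = false
      c += 1
    }
    var j = 0
    if (periodic) {
      var memory = -1 // prefix already known to match after a period shift
      while (j <= n - m) {
        var i = math.max(ell, memory) + 1
        while (i < m && needle(i) == haystack(i + j)) i += 1
        if (i >= m) {
          i = ell
          while (i > memory && needle(i) == haystack(i + j)) i -= 1
          if (i <= memory) return j
          j += period
          memory = m - period - 1
        } else {
          j += i - ell
          memory = -1
        }
      }
    } else {
      val shift = math.max(ell + 1, m - ell - 1) + 1
      while (j <= n - m) {
        var i = ell + 1
        while (i < m && needle(i) == haystack(i + j)) i += 1
        if (i >= m) {
          i = ell
          while (i >= 0 && needle(i) == haystack(i + j)) i -= 1
          if (i < 0) return j
          j += shift
        } else j += i - ell
      }
    }
    -1
  }
}
```

The sketch compares the right part of the needle first, then the left part, and shifts by the mismatch position or the period, so it never needs a shift table proportional to the needle length. Signed byte comparison is used in `maxSuffix`; any consistent total order works for the factorization step.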

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| LICENSE | Adds ByteString.scala to the list of Netty-derived sources. |
| legal/pekko-actor-jar-license.txt | Mirrors the LICENSE update for the actor JAR legal file. |
| actor/src/main/scala/org/apache/pekko/util/SWARUtil.scala | Makes SWAR helpers final and adds Netty-derived helpers (arrayBytesMatch, maxSuf). |
| actor/src/main/scala/org/apache/pekko/util/ByteString.scala | Adds Netty-derived two-way matching for indexOfSlice and introduces bytesMatch abstraction/optimizations. |

pjfanning and others added 4 commits April 4, 2026 14:41
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Member

@He-Pin He-Pin left a comment

[WIP] The Two-Way string matching approach from Netty is a good candidate for optimizing indexOfSlice on large ByteStrings. However:

  1. This is marked WIP and the PR description says 'the aim is to benchmark both solutions'. Without benchmark results showing a clear win, this is hard to evaluate.
  2. The licensing update (adding Netty-derived code to LICENSE) is correctly handled.
  3. This relates to #2323 which takes a simpler approach (just using getLong for byte matching). Consider benchmarking both approaches and merging whichever wins.

Recommendation: Complete the benchmarks and either merge or close based on results. Don't leave as WIP indefinitely.
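
To illustrate what "using getLong for byte matching" can look like, here is a minimal sketch that compares two byte-array segments eight bytes at a time. It is an assumption-laden illustration, not the code from #2323 or from this PR: the object `LongCompareSketch` and method `segmentsMatch` are hypothetical names, it reads longs through `java.nio.ByteBuffer` rather than Pekko's own helpers, and it assumes the caller has already validated the offsets and length.

```scala
import java.nio.ByteBuffer

object LongCompareSketch {
  /**
   * True if a[aOff, aOff + len) equals b[bOff, bOff + len).
   * Assumes both ranges lie within their arrays (no bounds checks here).
   */
  def segmentsMatch(a: Array[Byte], aOff: Int, b: Array[Byte], bOff: Int, len: Int): Boolean = {
    // Wrapping does not copy; absolute getLong(index) reads 8 bytes at that index.
    val ab = ByteBuffer.wrap(a)
    val bb = ByteBuffer.wrap(b)
    var i = 0
    // Compare one long (8 bytes) per step while a full long remains.
    while (i + 8 <= len) {
      if (ab.getLong(aOff + i) != bb.getLong(bOff + i)) return false
      i += 8
    }
    // Compare the remaining 0 to 7 bytes one at a time.
    while (i < len) {
      if (a(aOff + i) != b(bOff + i)) return false
      i += 1
    }
    true
  }
}
```

Because only equality is tested, the buffer's byte order does not affect the result. Whether this beats a plain byte loop or Two-Way search for typical needle sizes is exactly the question the benchmarks requested above would need to answer.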

3 participants