Skip to content

Collapse IMessageStore to a single AppendMessages entry point#169

Open
stidsborg wants to merge 3 commits into
mainfrom
simplify-get-messages-for-replica
Open

Collapse IMessageStore to a single AppendMessages entry point#169
stidsborg wants to merge 3 commits into
mainfrom
simplify-get-messages-for-replica

Conversation

@stidsborg

@stidsborg stidsborg commented Jun 14, 2026

Copy link
Copy Markdown
Owner

Summary

Removes IMessageStore.AppendMessage (singular) and routes every message append through AppendMessages, collapsing the message store to a single append entry point.

The two production callers that consumed AppendMessage's returned ReplicaIdMessageWriter.AppendMessage and InvocationHelper.PublishCompletionMessageToParent — used it only to decide whether to interrupt/schedule the target flow. Since AppendMessages already interrupts every distinct target, that conditional becomes unconditional. This matches the documented intent that the interrupt is the suspend-race guard and watchdog backstop "even when the target is owned by another replica" — the singular path previously skipped it for cross-replica targets.

Behavioral note

Appending a message now always interrupts the target flow:

  • MessageWriter: previously interrupted only when the message landed on the publisher's replica; now always (safe superset — cross-replica delivery is unchanged via the owning replica's MessageWatchdog).
  • ExistingMessages.Append: previously never interrupted; now does.

Changes

  • Interface: drop Task<ReplicaId> AppendMessage(StoredId, StoredMessage) from IMessageStore.
  • Stores: remove the duplicate single-row INSERT from PostgreSQL/SqlServer/MariaDB; InMemoryFunctionStore inlines the per-message append into AppendMessages.
  • Callers: MessageWriter, InvocationHelper, ExistingMessages now call AppendMessages([...]).
  • Dead-plumbing cleanup: the singular path was the only reason IFlowsManager was threaded into MessageWriterMessageWritersInvocationHelper; that dependency (and the FunctionsRegistry wiring) is removed.
  • Tests: a test-only AppendMessage extension forwards to AppendMessages, so existing message-store tests keep exercising the single-append path.

Lock-contention fix (SqlServer + MariaDB)

Routing single appends through AppendMessages initially merged the INSERT and the interrupt UPDATE into one command/connection in the SqlServer and MariaDB stores. Holding the insert's locks while the interrupt UPDATE ran deadlocked tight message-exchange loops (ping-pong) where two executing flows interrupt each other — SQL Server's unbounded lock-wait hung indefinitely. Fixed by executing the interrupt as a separate command after the insert (matching the previous append-then-schedule design). PostgreSQL already runs them as separate auto-committing batch commands and is unaffected.

Testing

  • In-memory suite: 518/518 passed
  • PostgreSQL store suite: 380/380 passed
  • MariaDB store suite: 380/380 passed
  • SqlServer store suite: 379 passed; PingPongMessagesCanBeExchangedMultipleTimes flaky-hangs only under full-suite load. Verified this hang is pre-existing on main/base (the base commit hangs on the same test in the same way) — it is the known load-dependent idle-target-restart race, not introduced here. The test passes reliably when run on its own on this branch.

Note

This branch also carries the previously unmerged commit 151de337 (Simplify GetMessagesForReplica to List and add ignorePositions), per the decision to stack rather than split into a separate PR.

…Positions

Return a List<StoredMessages> (StoredId + its messages) instead of a
Dictionary<StoredId, List<StoredMessage>>, and accept an ignorePositions
parameter so already-pushed messages are not re-delivered.

MessageWatchdog (version 1) keeps an ever-growing set of pushed positions
and passes it each tick; this will be refined once the QueueManager reports
the positions it has persisted into its effects.

Stores filter the ignore-set via a stable, parameterized query to avoid
plan-cache churn: Postgres '!= ALL($2)', SQL Server STRING_SPLIT, MariaDB
FIND_IN_SET. Adds a cross-store test covering empty and non-empty ignore sets.
…essages

Collapses the message store to a single append entry point. The two callers
that consumed AppendMessage's returned ReplicaId (MessageWriter and
InvocationHelper.PublishCompletionMessageToParent) used it only to decide
whether to interrupt/schedule the target flow. AppendMessages already
interrupts every distinct target inside its batch, so the interrupt is now
unconditional - which matches the documented intent that the interrupt is the
suspend-race guard and watchdog backstop "even when the target is owned by
another replica".

This removes the now-dead IFlowsManager dependency from MessageWriter,
MessageWriters and InvocationHelper (and the corresponding FunctionsRegistry
wiring). The three SQL store implementations drop their duplicate single-row
INSERT; InMemoryFunctionStore inlines the per-message append into
AppendMessages. A test-only AppendMessage extension forwards to AppendMessages
so existing message-store tests keep exercising the single-append path.
…contention

Routing single appends through AppendMessages merged the INSERT and the
interrupt UPDATE into one command/connection in the SqlServer and MariaDB
stores. Holding the insert's locks while the interrupt UPDATE ran deadlocked
tight message-exchange loops (ping-pong) where two executing flows interrupt
each other - SQL Server's unbounded lock-wait made it hang indefinitely.

Execute the interrupt as a separate command after the insert (matching the
previous append-then-schedule design). PostgreSQL already runs them as
separate auto-committing batch commands and is unaffected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant