
feat: background workers = non-HTTP workers with shared state #2287

Open
nicolas-grekas wants to merge 2 commits into php:main from
nicolas-grekas:sidekicks

Conversation

@nicolas-grekas
Contributor

@nicolas-grekas nicolas-grekas commented Mar 16, 2026

Summary

Background workers are long-running PHP workers that run outside the HTTP cycle. They observe their environment (Redis, DB, filesystem, etc.) and publish variables that HTTP workers read per-request - enabling real-time reconfiguration without restarts or polling.

PHP API

  • frankenphp_set_vars(array $vars): void - publishes vars from a background worker (persistent memory, cross-thread). Skips all work when data is unchanged (=== check).
  • frankenphp_get_vars(string|array $name, float $timeout = 30.0): array - reads vars from any worker context (blocks until first publish, generational cache)
  • frankenphp_get_worker_handle(): resource - returns a pipe-based stream for stream_select() integration. Closed on shutdown (EOF = stop).

Caddyfile configuration

php_server {
    # HTTP worker (unchanged)
    worker public/index.php { num 4 }

    # Named background worker (auto-started if num >= 1)
    worker bin/worker.php {
        background
        name config-watcher
        num 1
    }

    # Catch-all for lazy-started names
    worker bin/worker.php {
        background
    }
}

  • background marks a worker as non-HTTP
  • name specifies an exact worker name; workers without name are catch-all for lazy-started names
  • Not declaring a catch-all forbids lazy-started ones
  • max_threads on catch-all sets a safety cap for lazy-started instances (defaults to 16)
  • max_consecutive_failures defaults to 6 (same as HTTP workers)
  • max_execution_time automatically disabled for background workers
  • Each php_server block has its own isolated scope (managed by NextBackgroundWorkerScope())
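
To make the named vs. catch-all rules above concrete, here is a minimal Go sketch of how a per-scope registry could decide whether a lazy start is allowed. It is an illustration only: `scopeRegistry`, `ensureStarted` and the error values are hypothetical names, not FrankenPHP's internal types.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errNotDeclared = errors.New("no matching background worker declared")
	errCapReached  = errors.New("lazy-start cap reached")
)

// scopeRegistry is a hypothetical stand-in for one php_server scope.
type scopeRegistry struct {
	mu        sync.Mutex
	named     map[string]bool // names declared with `name`
	catchAll  bool            // a nameless `background` worker exists
	maxLazy   int             // plays the role of max_threads on the catch-all
	started   map[string]bool // workers already running
	lazyCount int             // lazy-started instances so far
}

func newScopeRegistry(named []string, catchAll bool, maxLazy int) *scopeRegistry {
	r := &scopeRegistry{
		named:    make(map[string]bool),
		catchAll: catchAll,
		maxLazy:  maxLazy,
		started:  make(map[string]bool),
	}
	for _, n := range named {
		r.named[n] = true
	}
	return r
}

// ensureStarted starts a worker at most once and reports whether this call
// was the one that started it.
func (r *scopeRegistry) ensureStarted(name string) (bool, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.started[name] {
		return false, nil
	}
	if !r.named[name] {
		if !r.catchAll {
			return false, errNotDeclared // no catch-all: lazy names are forbidden
		}
		if r.lazyCount >= r.maxLazy {
			return false, errCapReached
		}
		r.lazyCount++
	}
	r.started[name] = true
	return true, nil
}

func main() {
	r := newScopeRegistry([]string{"config-watcher"}, true, 16)
	for _, name := range []string{"config-watcher", "config-watcher", "redis-watcher"} {
		started, err := r.ensureStarted(name)
		fmt.Println(name, started, err)
	}
}
```

The single mutex keeps the "at-most-once start" check and the cap check atomic, which is the property the bullet list relies on.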

Shutdown

On restart/shutdown, the signaling stream is closed. Workers detect this via fgets() returning false (EOF). Workers have a 5-second grace period.

After the grace period, a best-effort force-kill is attempted:

  • Linux ZTS: arms PHP's own max_execution_time timer cross-thread via timer_settime(EG(max_execution_timer_timer))
  • Windows: CancelSynchronousIo + QueueUserAPC interrupts blocking I/O and alertable waits
  • macOS: no per-thread mechanism available; stuck threads are abandoned

During the restart window, get_vars returns the last published data (stale but available). A warning is logged on crash.

Forward compatibility

The signaling stream is forward-compatible with the PHP 8.6 poll API RFC. Poll::addReadable accepts stream resources directly - code written today with stream_select will work on 8.6 with Poll, no API change needed.

Architecture

  • Per-php_server scope isolation with internal registry (unexported types, minimal public API via NextBackgroundWorkerScope())
  • Dedicated backgroundWorkerThread handler implementing threadHandler interface - decoupled from HTTP worker code paths
  • drain() closes the signaling stream (EOF) for clean shutdown signaling
  • Persistent memory (pemalloc) with RWMutex for safe cross-thread sharing
  • set_vars skip: uses PHP's === (zend_is_identical) to detect unchanged data - skips validation, persistent copy, write lock, and version bump
  • Generational cache: per-thread version check skips lock + copy when data hasn't changed; repeated get_vars calls return the same array instance (=== is O(1))
  • Opcache immutable array zero-copy fast path (IS_ARRAY_IMMUTABLE)
  • Interned string optimizations (ZSTR_IS_INTERNED) - skip copy/free for shared memory strings
  • Rich type support: null, scalars, arrays (nested), enums
  • Crash recovery with exponential backoff and automatic restart
  • Background workers integrate with existing worker infrastructure (scaling, thread management)
  • $_SERVER['FRANKENPHP_WORKER_NAME'] set for background workers
  • $_SERVER['FRANKENPHP_WORKER_BACKGROUND'] set for all workers (true/false)
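
The set_vars skip and the generational cache can be pictured with a small Go model. This is a sketch under assumptions, not the C implementation: `sharedVars` and `threadCache` are hypothetical names, `reflect.DeepEqual` only approximates PHP's ===, the copy is shallow, and a single writer per scope is assumed.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
	"sync/atomic"
)

// sharedVars models one scope's published snapshot.
type sharedVars struct {
	mu      sync.RWMutex
	vars    map[string]any
	version atomic.Uint64
}

// set publishes a snapshot and reports whether anything was written.
// Unchanged data skips the write lock and the version bump.
func (s *sharedVars) set(vars map[string]any) bool {
	s.mu.RLock()
	unchanged := s.version.Load() > 0 && reflect.DeepEqual(s.vars, vars)
	s.mu.RUnlock()
	if unchanged {
		return false
	}
	s.mu.Lock()
	s.vars = vars
	s.version.Add(1)
	s.mu.Unlock()
	return true
}

// threadCache is the per-thread view: it remembers the version it last copied.
type threadCache struct {
	version uint64
	vars    map[string]any
}

// get returns the cached map when the version is unchanged, so repeated calls
// hand back the same instance without taking the lock or copying.
func (c *threadCache) get(s *sharedVars) map[string]any {
	if c.vars != nil && c.version == s.version.Load() {
		return c.vars
	}
	s.mu.RLock()
	defer s.mu.RUnlock()
	c.version = s.version.Load()
	c.vars = make(map[string]any, len(s.vars))
	for k, v := range s.vars {
		c.vars[k] = v // shallow copy, enough for this illustration
	}
	return c.vars
}

func main() {
	s := &sharedVars{}
	fmt.Println(s.set(map[string]any{"maintenance": false})) // first publish
	fmt.Println(s.set(map[string]any{"maintenance": false})) // identical data
	c := &threadCache{}
	fmt.Println(c.get(s)["maintenance"])
}
```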

Example

// Background worker: polls Redis every 5s
$stream = frankenphp_get_worker_handle();
$redis = new Redis();
$redis->connect('127.0.0.1');

frankenphp_set_vars([
    'maintenance' => (bool) $redis->get('maintenance_mode'),
    'feature_flags' => json_decode($redis->get('features'), true),
]);

while (true) {
    $r = [$stream]; $w = $e = [];
    if (false === @stream_select($r, $w, $e, 5)) { break; }
    if ($r && false === fgets($stream)) { break; } // EOF = stop

    frankenphp_set_vars([
        'maintenance' => (bool) $redis->get('maintenance_mode'),
        'feature_flags' => json_decode($redis->get('features'), true),
    ]);
}

// HTTP worker
$config = frankenphp_get_vars('config-watcher');
if ($config['maintenance']) {
    return new Response('Down for maintenance', 503);
}

Test coverage

17 unit tests + 1 Caddy integration test covering: basic vars, at-most-once start, validation, type support (enums, binary-safe strings), multiple background workers, multiple entrypoints, crash restart, signaling stream, worker restart lifecycle, non-background-worker error handling, identity detection, generational cache, named auto-start with m# prefix.

All tests pass on PHP 8.2, 8.3, 8.4, and 8.5 with -race. Zero memory leaks on PHP debug builds.

Documentation

Full docs at docs/background-workers.md.

@nicolas-grekas nicolas-grekas force-pushed the sidekicks branch 4 times, most recently from e1655ab to 867e9b3 Compare March 16, 2026 20:26
@AlliBalliBaba
Contributor

AlliBalliBaba commented Mar 16, 2026

Interesting approach to parallelism, what would be a concrete use case for only letting information flow one way from the sidekick to the http workers?

Usually the flow would be inverted, where an HTTP worker offloads work to a pool of 'sidekick' workers and can optionally wait for a task to complete.

@nicolas-grekas nicolas-grekas force-pushed the sidekicks branch 2 times, most recently from da54ab8 to a06ba36 Compare March 16, 2026 21:45
@henderkes
Contributor

Thank you for the contribution. Interesting idea, but I'm thinking we should merge the approach with #1883. The kind of worker is the same, how they are started is but a detail.

@nicolas-grekas the Caddyfile setting should likely be per php_server, not a global setting.

@nicolas-grekas nicolas-grekas force-pushed the sidekicks branch 7 times, most recently from ad71bfe to 05e9702 Compare March 17, 2026 08:03
@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 17, 2026

@AlliBalliBaba The use case isn't task offloading (HTTP->worker), but out-of-band reconfigurability (environment->worker->HTTP). Sidekicks observe external systems (Redis Sentinel failover, secret rotation, feature flag changes, etc.) and publish updated configuration that HTTP workers pick up on their next request, with per-request consistency guaranteed via $_SERVER injection. No polling, no TTLs, no redeployment.

Task offloading (what you describe) is a valid and complementary pattern, but it solves a different problem. The non-HTTP worker foundation here could support both.

@henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:

  • Minimal FrankenPHP config: a single sidekick_entrypoint in php_server (thanks for the idea). No need to declare individual workers in the Caddyfile. The PHP app controls which sidekicks to start via frankenphp_sidekick_start(), keeping the infrastructure config simple.

  • Graceful degradability: apps should work correctly with or without FrankenPHP. The same codebase should work on FrankenPHP (with real-time reconfiguration) and on traditional setups (with static or always refreshed config).

  • Nice framework integration: the sidekick_entrypoint pointing to e.g. bin/console means sidekicks are regular framework commands, making them easy to develop.

Happy to follow up with your proposals now that this is hopefully clarified.
I'm going to continue on my own a bit also :)

@dunglas
Member

dunglas commented Mar 17, 2026

Great PR!

Couldn't we create a single API that covers both use cases?

We try to keep the number of public symbols and config options as small as possible!

@henderkes
Contributor

@henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:

Yes, that's why I'd like to unify the two APIs and background implementations into one. Unfortunately the first task worker attempt didn't make it into main, but perhaps @AlliBalliBaba can use his experience with the previous PR to influence this one. I'd be more in favour of a general API than a specific sidecar one.

@nicolas-grekas
Contributor Author

The PHP-side API has been significantly reworked since the initial iteration: I replaced $_SERVER injection with an explicit get_vars/set_vars protocol.

The old design used frankenphp_set_server_var() to inject values into $_SERVER implicitly. The new design uses an explicit request/response model:

  • frankenphp_sidekick_set_vars(array $vars): called from the sidekick to publish a complete snapshot atomically
  • frankenphp_sidekick_get_vars(string|array $name, float $timeout = 30.0): array: called from HTTP workers to read the latest vars

Key improvements:

  • No race condition on startup: get_vars blocks until the sidekick has called set_vars. The old design had a race where HTTP requests could arrive before the sidekick had published its values.
  • Strict context enforcement: set_vars and should_stop throw RuntimeException if called from a non-sidekick context.
  • Atomic snapshots: set_vars replaces all vars at once. No partial state possible.
  • Parallel start: get_vars(['redis-watcher', 'feature-flags']) starts all sidekicks concurrently, waits for all, returns vars keyed by name.
  • Works in both worker and non-worker mode: get_vars works from any PHP script served by php_server, not just from frankenphp_handle_request() workers.

Other changes:

  • sidekick_entrypoint moved from global frankenphp block to per-php_server (as @henderkes suggested)
  • Removed the $argv parameter: the sidekick name is the command, passed as $_SERVER['argv'][1]
  • set_vars is restricted to sidekick context only (throws if called from HTTP workers)
  • get_vars accepts string|array: when given an array, all sidekicks start in parallel
  • Atomic snapshots: set_vars replaces all vars at once, no partial state
  • Binary-safe values (null bytes, UTF-8)

@nicolas-grekas nicolas-grekas force-pushed the sidekicks branch 3 times, most recently from cb65f46 to 4dda455 Compare March 17, 2026 10:46
@nicolas-grekas
Contributor Author

Thanks @dunglas and @henderkes for the feedback. I share the goal of keeping the API surface minimal.

Thinking about it more, the current API is actually quite small and already general:

  • 1 Caddyfile setting: sidekick_entrypoint (per php_server)
  • 3 PHP functions: get_vars, set_vars, should_stop

The name "sidekick" works as a generic concept: a helper running alongside. The current set_vars/get_vars protocol covers the config-publishing use case. For task offloading (HTTP->worker) later, the same sidekick infrastructure could support:

  • frankenphp_sidekick_send_task(string $name, mixed $payload): mixed
  • frankenphp_sidekick_receive_task(): mixed

Same worker type, same sidekick_entrypoint, same should_stop(). Just a different communication pattern added on top. No new config, no new worker type.

So the path would be:

  1. This PR: sidekicks with set_vars/get_vars (config publishing)
  2. Future PR: add send_task/receive_task (task offloading), reusing the same non-HTTP worker foundation

The foundation (non-HTTP threads, cooperative shutdown, crash recovery, per-php_server scoping) is shared. Only the communication primitives differ.

WDYT?

@nicolas-grekas nicolas-grekas force-pushed the sidekicks branch 4 times, most recently from b3734f5 to ed79f46 Compare March 17, 2026 11:48
@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 17, 2026

I think the failures are unrelated - a cache reset would be needed. Any help on this topic?

@alexandre-daubois
Member

alexandre-daubois commented Mar 17, 2026

Hmm, it seems they are on some versions, for example here: https://github.com/php/frankenphp/actions/runs/23192689128/job/67392820942?pr=2287#step:10:3614

For the cache, I'm not aware of a GitHub feature that allows clearing everything, unfortunately 🙁

preparedEnvNeedsReplacement bool
logger *slog.Logger
requestOptions []frankenphp.RequestOption
backgroundScope string
Contributor

@AlliBalliBaba AlliBalliBaba Mar 29, 2026

The scope can also just be an integer. It should probably be requested from and managed by the frankenphp package.

Contributor Author

@nicolas-grekas nicolas-grekas Mar 30, 2026

I'm not sure I see the benefits of doing so. I'm leaving this as is, unless you feel strongly about this?

@AlliBalliBaba
Contributor

AlliBalliBaba commented Mar 29, 2026

Would be nice if someone with fresh eyes could look over this PR. Short summary from my side:

  • the PR is still very big, there's probably a lot of room for simplification
  • the API is very exotic. Not necessarily bad, but it would be nice if we could have more general abstractions or a similar implementation for orientation somewhere else.
  • Still feels a bit messy to allow starting background workers at runtime. Is the main use case that libraries can start workers without the user configuring them?
  • Doing shutdown via stream is very clunky
  • Would be nice to have clean sync shutdown with signals, not sure this is possible rn
  • no integration caddy_tests
  • copying across threads is still very limited in what you can copy (something to be aware of in the future)

@AlliBalliBaba
Contributor

Another reason I dislike streams is that the polling api from (this RFC) might already be in PHP 8.6.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Thanks for the review. Addressing each point:

PR size / simplification: I removed dead methods, moved scope management to the frankenphp package (NextBackgroundWorkerScope()), added a Caddy integration test (TestBackgroundWorkerCaddy). Happy to simplify further if you see specific spots.

API: set_vars/get_vars/get_signaling_stream - three functions, one Caddyfile directive. I'm not sure what a simpler abstraction would look like for cross-thread shared state with cooperative shutdown. The cooperative part is critical to me: background workers have fundamentally different execution models than HTTP ones.

Runtime worker starting: The main use case is libraries that ship their own background workers (e.g., a Redis Sentinel watcher package) without requiring users to list each one in the Caddyfile. The catch-all worker { background } enables this. Named workers with num 1 cover the explicit case. Both patterns are opt-in from the Caddyfile - PHP can't start workers that aren't declared.

Cross-thread copying: Intentionally limited to scalars/arrays/enums - this enables immutable array zero-copy, interned string optimization, and the === skip in set_vars. Objects would require serialize/unserialize which breaks SRP (foreign exceptions from the copy layer). For complex objects, Symfony's DeepCloner (symfony/symfony#63695) converts any serializable value to an array - perfect fit for set_vars. Explicit serialize() remains an option.
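
To illustrate the "scalars/arrays only" restriction, here is a minimal Go sketch of a recursive validator that rejects anything else before a copy would happen. It is an assumption-laden model: `validate` is a hypothetical helper, Go types stand in for PHP values, and enums are left out for brevity.

```go
package main

import "fmt"

// validate walks a value and returns an error naming the first entry that is
// not nil, a scalar, or a nested list/map of such values.
func validate(v any, path string) error {
	switch t := v.(type) {
	case nil, bool, int, int64, float64, string:
		return nil
	case []any:
		for i, e := range t {
			if err := validate(e, fmt.Sprintf("%s[%d]", path, i)); err != nil {
				return err
			}
		}
		return nil
	case map[string]any:
		for k, e := range t {
			if err := validate(e, path+"."+k); err != nil {
				return err
			}
		}
		return nil
	default:
		return fmt.Errorf("%s: unsupported type %T", path, v)
	}
}

func main() {
	ok := map[string]any{
		"maintenance": true,
		"flags":       map[string]any{"new_ui": true},
		"hosts":       []any{"a", "b"},
	}
	fmt.Println(validate(ok, "vars"))

	bad := map[string]any{"conn": make(chan int)} // stands in for an object/resource
	fmt.Println(validate(bad, "vars"))
}
```

Rejecting unsupported values up front keeps the copy layer free of foreign failure modes, which is the concern raised about serialize/unserialize.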

Stream-based shutdown: I explored signal-based shutdown extensively (pthread_kill, timer_create, CancelSynchronousIo). Go's signal trampoline makes this unviable cross-platform - documented in the PR.

On the "clunky" concern, it's worth noting that the upcoming PHP 8.6 poll API RFC validates this direction: it introduces Poll as a modern replacement for stream_select, with platform-optimized backends (epoll, kqueue, WSAPoll). Our signaling stream is forward-compatible: Poll::addReadable accepts stream resources directly. Code written today with stream_select will work on 8.6 with Poll - no API change needed on the FrankenPHP side. The stream approach isn't a stopgap, it's the foundation that the poll API builds on.

Example of forward compatibility - same code, both worlds:

// Works today (PHP 8.2+)
$stream = frankenphp_worker_get_signaling_stream();
$r = [$stream]; $w = $e = [];
stream_select($r, $w, $e, 5); // arguments are passed by reference, so they must be variables
$signal = fgets($stream);

// Works on PHP 8.6+ with zero changes to FrankenPHP
$poll = new Poll();
$poll->addReadable($stream, fn() => match (fgets($stream)) {
    "stop\n" => false,
    "task\n" => handleTask(), // anticipating from the task-based API in #2319 
});
$poll->addTimer(5.0, fn() => frankenphp_worker_set_vars(refreshConfig()));
$poll->run();

Same $stream, no new FrankenPHP API needed. And if the poll RFC lands with a SignalHandle type, we could later add a higher-level alternative that hides the string parsing entirely (to be confirmed it's worth it in the future):

// Possible future API (PHP 8.6+, if SignalHandle lands)
$poll = new Poll();
$poll->addHandle(frankenphp_worker_get_poll_handle(), fn(string $signal) => match ($signal) {
    'stop' => false,
    'task' => handleTask(),
});
$poll->addTimer(5.0, fn() => frankenphp_worker_set_vars(refreshConfig()));
$poll->run();

The stream version stays for BC, same underlying pipe, different wrapper.

For simple cases today, the userland background_worker_should_stop() helper wraps the stream_select boilerplate into a one-liner.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Protocol update: instead of using "stop\n" for signaling cooperative shutdown, closing the stream does the job. This means replacing what was "stop\n" === fgets($h) before by false === fgets($h). Cleaner.

I'm now leaning toward renaming functions:

  • frankenphp_get_worker_handle() instead of frankenphp_worker_get_signaling_stream()
  • frankenphp_get_worker_vars() instead of frankenphp_worker_get_vars()
  • frankenphp_set_worker_vars() instead of frankenphp_worker_set_vars()

WDYT?

@henderkes
Contributor

henderkes commented Mar 30, 2026

Another reason I dislike streams is that the polling api from (this RFC) might already be in PHP 8.6.

@AlliBalliBaba If anything that's a positive because it's built upon streams and will just make the userland handling cleaner (which is my main objection for the stream api).

Protocol update: instead of using "stop\n" for signaling cooperative shutdown, closing the stream does the job. This means replacing what was "stop\n" === fgets($h) before by false === fgets($h). Cleaner.

Much better!

frankenphp_get_worker_handle() instead of frankenphp_worker_get_signaling_stream()

I like it better than get_signaling_stream. Especially because:

frankenphp_get_worker_vars() instead of frankenphp_worker_get_vars()
frankenphp_set_worker_vars() instead of frankenphp_worker_set_vars()

Is there actually a reason to not make this frankenphp_get_vars and frankenphp_set_vars? It's less specific, but that would allow expanding it to regular threads or requests later. If it's specific to a worker perhaps we could make that an (optional?) argument that accepts a worker handle?

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Thanks for the push in that direction :)

@dunglas asked me why only arrays are accepted as "vars", suggesting we should accept strings and other scalars.

This made me change the API this way. Still 3 functions: (reverted)

  • frankenphp_set_worker_state()
  • frankenphp_get_worker_state()
  • frankenphp_get_worker_handle()

About an HTTP-worker state, the purpose would be to expose a thread-safe in-memory KV store to sync them?
What about defining a convention like: http-workers in the empty string $name?

@henderkes
Contributor

I don't like that naming at all. It's too specific because it implies getting the state of a specific worker, which isn't at all the intention of the calling code. If we treat it more as a thread-safe, shared KV store we could want values for a specific topic, regardless of whether background workers, other HTTP workers or even just regular requests set these variables before. I really liked get_vars and set_vars for that reason, with a topic to get the vars from, perhaps with an optional worker handle if really desired.

(Though frankenphp_get_worker_handle would then have to return an actual handle, not only a stream.)

Limiting this PR to only background workers is good, but we shouldn't make the interface so specific that it wouldn't make sense for other related features that we could add in the future.

@nicolas-grekas
Contributor Author

Agreed the _worker_state naming was a step too far, thanks for the arguments. Going back to vars:

  • frankenphp_set_worker_vars(): void
  • frankenphp_get_worker_vars(): array
  • frankenphp_get_worker_handle(): resource

The $name parameter maps to worker names intentionally, that's what enables lazy-start. Decouple the name from workers and you lose that guarantee: someone would have to start workers explicitly, pushing orchestration complexity to userland.

About non-array values, I'm also pushing back, with one core reason: contract evolution. When state is an array, the producer can add keys without breaking any consumer, that's a non-breaking change by definition. This makes things so much easier to evolve without limiting anything in practice. Same reason HTTP APIs return JSON objects rather than bare values. The overhead is one hash table, the forward-compatibility guarantee is significant.
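
The contract-evolution argument can be shown with a tiny Go example, offered as an illustration only. `maintenanceEnabled` is a hypothetical consumer that reads the one key it knows about, so a producer that later adds keys does not break it.

```go
package main

import "fmt"

// maintenanceEnabled reads a single known key and ignores every other key.
// A missing or non-boolean value is treated as false.
func maintenanceEnabled(vars map[string]any) bool {
	on, _ := vars["maintenance"].(bool)
	return on
}

func main() {
	// Producer, first version: one key.
	v1 := map[string]any{"maintenance": true}
	// Producer, later version: a key was added, nothing was removed.
	v2 := map[string]any{"maintenance": true, "feature_flags": map[string]any{"new_ui": true}}

	fmt.Println(maintenanceEnabled(v1), maintenanceEnabled(v2))
}
```

Had the published state been a bare boolean, adding `feature_flags` would have required changing the value's type and every consumer with it.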

I'm now exploring adding get_vars(null) to access the http-workers' scope.

@henderkes
Contributor

Agreed the _worker_state naming was a step too far, thanks for the arguments. Going back to vars:

  • frankenphp_set_worker_vars(): void
  • frankenphp_get_worker_vars(): array
  • frankenphp_get_worker_handle(): resource

The $name parameter maps to worker names intentionally, that's what enables lazy-start. Decouple the name from workers and you lose that guarantee: someone would have to start workers explicitly, pushing orchestration complexity to userland.

With the $name parameter already in place, just do frankenphp_set_vars and frankenphp_get_vars. Specifying "worker" in the function name here has no meaning when we already specify the worker name in the parameter. I think the guarantee that they're set up is great for background workers, but I don't yet know (or want to think about) whether we'll need it for future use cases.

To elaborate a bit:

/**
 * @param ?string $name The name of the scope you wish to access variables from. In case of a lazy-loaded background worker scope, this starts the background worker.
 */
function frankenphp_get_vars(?string $name): array;

We can add more powerful concepts of scopes down the line, let regular threads set vars, anything we want really, with only a change to the docblock, while keeping every addition completely backward compatible. If we call them frankenphp_get_worker_vars we're limited to only workers. No topics, no future scopes, no future regular threads. If we want to add anything down the line, we can't break BC, so we need to add an extra function that needs nearly the exact same API, or we live with an API name that no longer makes sense.

About non-array values, I'm also pushing back, with one core reason: contract evolution. When state is an array, the producer can add keys without breaking any consumer, that's a non-breaking change by definition. This makes things so much easier to evolve without limiting anything in practice. Same reason HTTP APIs return JSON objects rather than bare values. The overhead is one hash table, the forward-compatibility guarantee is significant.

+1, I think array only already makes sense purely by naming. It's get_vars after all.

I'm now exploring adding get_vars(null) to access the http-workers' scope.

Different PR, please. This current one is already so massive, it's hard to continuously review the code before everyone here is happy with it in concept. Right now my focus is on making sure we don't make limiting decisions for things we can't foresee right now. Apart from that, one step at a time.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Works for me, PR updated:

  • frankenphp_set_vars(): void
  • frankenphp_get_vars(): array
  • frankenphp_get_worker_handle(): resource

@nicolas-grekas
Contributor Author

$scope instead of $name?
You suggested $topic but I'm not sure about it.

@henderkes
Contributor

Ah right, that's part of BC too because people could explicitly pass the parameter by name.

No idea, I think $name is fine, though not as self-explanatory. I stole $topic from mercure because it's really similar to what would happen in frankenphp.

@nicolas-grekas
Contributor Author

Another idea might be $namespace.

@dbu
Contributor

dbu commented Mar 30, 2026

No idea, I think $name is fine, though not as self-explanatory. I stole $topic from mercure because it's really similar to what would happen in frankenphp.

set_vars sets a single variable, and get_vars gets one or more variables. i think name is good to identify that (especially now that the variable can be a scalar and does not need to be a hashmap array).

thinking about this, i wonder why we call the method set_vars? we can only set a single variable.
getting one "name" also returns us just that variable as it was set. getting multiple gives a hashmap. from the point of view of clear interface, it would be more clear with frankenphp_set_var. and two methods frankenphp_get_var(string) and frankenphp_get_vars(array) to have clear naming and avoid the different return types based on the argument type.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

"vars"-plural refers to the argument being a collections of value wrapped in an array. We're trying to have the minimum API surface, so not sure I'd add one more function.

@henderkes
Contributor

set_vars sets a single variable

It should be setting a number of variables (php array) for a scope.

Though it should arguably be

frankenphp_set_vars(?string $name, array $vars): void;

@nicolas-grekas
Contributor Author

Though it should arguably be frankenphp_set_vars(?string $name, array $vars): void;

Not sure about this: one scope one name, part of the context. Adding a name would require the scope knowing its name, and would allow a scope to alter another scope. Not good either, this makes things way less bounded - aka more fragile to build on IMHO.

@henderkes
Contributor

Though it should arguably be frankenphp_set_vars(?string $name, array $vars): void;

Not sure about this: one scope one name, part of the context. Adding a name would require the scope knowing its name, and would allow a scope to alter another scope. Not good either, this makes things way less bounded - aka more fragile to build on IMHO.

Shouldn't a background worker always know its (application-side) name, since it's given by other application code? If frankenphp_set_vars can't declare a scope, things will become very messy if we try to expand it to HTTP workers, queue workers or regular threads, even though it also introduces a difficulty in coordinating that a background worker doesn't overwrite another's variables. I'll need to sleep on it.

@nicolas-grekas
Contributor Author

By trying to make the new API do everything, we might just fail everything...
The current feature scope is really well bounded and solid.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars(). This is a known feature-scope. The other ambitions are not clearly enough defined and not to be addressed in this PR IMHO.

@AlliBalliBaba
Contributor

Same $stream, no new FrankenPHP API needed. And if the poll RFC lands with a SignalHandle type, we could later add a higher-level alternative that hides the string parsing entirely (to be confirmed it's worth it in the future):

Yeah I'm talking about abstracting this via handles in the future. But I guess you're right that it would be 8.6+ only.

Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars()

I actually kind of like that this is a key-value store. If the endgoal is to allow libraries to start their own background processes, wouldn't it be better to do something like this?

// In library
$redisEndpoints = frankenphp_get_vars('redis:redis-host');

if(!$redisEndpoints){
    frankenphp_start_background_worker('/path/to/background-worker.php', [
        'host' => 'redis-host'
    ]);
}

$redisEndpoints = frankenphp_get_vars('redis');

Starting the background worker simply blocks until it has reached 'ready' by calling set_vars. Would make this more generic, WDYT @nicolas-grekas @henderkes

@johanjanssens

@AlliBalliBaba pointed me to this PR: #2306 (comment) Reading the discussion it seems there is agreement that the shared state layer is essentially a process-wide key-value store. Took the time over the weekend to build one as a proof of concept, which I think offers some interesting benefits over the approach proposed in this PR.

You can find the code here: https://github.com/johanjanssens/frankenstate. It implements a cross-thread shared state as a standalone FrankenPHP extension, a SharedArray backed by Go (sync.RWMutex + map[string]any), exposed to PHP via ArrayAccess, and accessible over Redis wire protocol (RESP) on port 6380.

$state = new FrankenPHP\SharedArray();
$state['feature_flags'] = ['new_ui' => true];
$flags = $state['feature_flags'];

Key differences from the set_vars/get_vars approach:

  • Go-native: Go extensions write directly to the store (state.Set(k, v)), no CGO crossing on the write side. PHP reads via cached snapshots with version-gated refresh.
  • No new thread types: Any thread (Go or PHP) can read or write at any time. No dedicated background workers, no lifecycle management, no shutdown signaling.
  • No Caddyfile config: It's a PHP extension, not an infrastructure concern.
  • Redis protocol: The store speaks RESP, any process that talks Redis can push data in. redis-cli -p 6380 SET feature_flags '{"new_ui":true}' just works.

With this approach the "config watcher" flow changes to:

FrankenState: External system → any process (cron, script, CI/CD) → RESP → SharedArray → all PHP threads

The store is a KV with a standard protocol. The sync logic lives wherever it makes sense. A cron job, a deploy script, a Redis subscriber, whatever pushes data in over the Redis protocol, or whatever Go code uses the API. FrankenPHP doesn't need to know or care where the data comes from.

Might be worth considering as an alternative approach to the shared state part of this PR. With a standalone shared state store, the background worker and signaling infrastructure in this PR may not be needed, or could be developed as a more generic solution decoupled from state management. Happy to develop it further if there is interest.

Do not want to hijack this PR. Created a discussion thread here: #2324.

@AlliBalliBaba
Contributor

To truly benefit from ZTS, you'll have to do what's done in this PR and leave the copy mechanism on the C side. Couple that with potential 0-copy if the value hasn't changed and performance is potentially orders of magnitude better.

I still think we need a fast built-in cross-thread copy mechanism like was done in this PR, just think that the API could be more generic.

@henderkes
Contributor

I still think we need a fast built-in cross-thread copy mechanism like was done in this PR, just think that the API could be more generic.

Essentially my only remaining issue here.

Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars(). This is a known feature scope. The other ambitions are not clearly enough defined and not to be addressed in this PR IMHO.

I feel like it would be great if background workers could use this more general KV store as a backend to push/pull values back to http threads that started the background workers. I'm really sorry for being so nitpicky here because it's a great addition, I'm just focussed on making sure the decisions made here won't bite us later.

I'll give the code a proper review in a few days again.

@dbu
Contributor

dbu commented Apr 1, 2026

"vars"-plural refers to the argument being a collection of values wrapped in an array. We're trying to have the minimum API surface, so not sure I'd add one more function.

@nicolas-grekas I thought you widened the argument to also accept a single string or other scalar value. Or have you reverted that? If it's reverted, I retire my suggestion :-)

Regarding the general KV store vs workers: from my understanding, the intention from Nicolas was that workers could be automatically launched on demand when a namespace is requested. Maybe an approach could be:

  • generic set KV store
  • generic get KV store
  • specific get worker values (which is what nicolas proposed, but after making sure the worker is up accesses the underlying generic KV store)
  • specific worker signal stream

setting values in the worker could use the generic set KV store method. except if the worker get method needs to be aware when the value is set, maybe we'd need a specific worker set value method to unlock the blocking call.

but this would separate the handling of on-demand workers from the underlying KV mechanism. alternatively, we could have only the worker part and later refactor to a generic KV mechanism...
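To illustrate the "unlock the blocking call" concern, here is a rough Go sketch of a scope whose get blocks until the first set, with a timeout. This is an assumption about how the mechanism could look, not the code in this PR: `Scope`, `ErrTimeout` and the `ready` channel are hypothetical names. The first publish closes a channel, which releases every waiting reader at once.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrTimeout is returned when no value was published before the deadline.
var ErrTimeout = errors.New("timed out waiting for first publish")

// Scope is a hypothetical per-worker namespace in a generic KV store.
type Scope struct {
	mu    sync.RWMutex
	vars  map[string]any
	ready chan struct{} // closed on the first Set
	once  sync.Once
}

func NewScope() *Scope { return &Scope{ready: make(chan struct{})} }

// Set replaces the published vars with a private copy and signals readiness once.
func (s *Scope) Set(vars map[string]any) {
	cp := make(map[string]any, len(vars))
	for k, v := range vars {
		cp[k] = v
	}
	s.mu.Lock()
	s.vars = cp
	s.mu.Unlock()
	s.once.Do(func() { close(s.ready) })
}

// Get blocks until the first Set or until the timeout expires.
// The returned map is replaced wholesale on each Set, so callers must treat it as read-only.
func (s *Scope) Get(timeout time.Duration) (map[string]any, error) {
	select {
	case <-s.ready:
	case <-time.After(timeout):
		return nil, ErrTimeout
	}
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.vars, nil
}

func main() {
	scope := NewScope()

	// Stand-in for a background worker publishing its first value a bit later.
	go func() {
		time.Sleep(10 * time.Millisecond)
		scope.Set(map[string]any{"host": "new-redis-host"})
	}()

	vars, err := scope.Get(time.Second)
	fmt.Println(vars, err)
}
```

In this shape the generic set is enough to release waiters, because the readiness signal lives in the store itself. A worker-specific set method would only be needed if readiness had to be tracked outside the store.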

@henderkes
Contributor

henderkes commented Apr 1, 2026

  • specific get worker values (which is what nicolas proposed, but after making sure the worker is up accesses the underlying generic KV store)

Then we have an extra function for essentially the same thing, which is what I want to avoid.

I'm closely aligned with @AlliBalliBaba's vision here. Similar to what he proposed:

// In library
$redisEndpoints = frankenphp_get_vars('redis:redis-host');

if(!$redisEndpoints){
    frankenphp_start_background_worker('/path/to/background-worker.php', [
        'host' => 'redis-host'
    ]);
}

$redisEndpoints = frankenphp_get_vars('redis');

Except that I like @nicolas-grekas concept of at-most-one worker, which would simplify this to:

# in library
frankenphp_start_background_worker("projectDir/path/to/redis-updater.php", scopes: ['redis-host'], args: []);
$redisEndpoints = frankenphp_get_vars('redis-host');

# in redis-updater
frankenphp_set_vars($scopes, ['host' => 'new-redis-host']);


7 participants