feat(flagd): extract evaluator into api, core, and testkit packages #377

Open
aepfli wants to merge 20 commits into main from
feat/extract-flagd-evaluator-api-core-testkit

Conversation

@aepfli
Member

@aepfli aepfli commented Apr 8, 2026

Summary

Mirrors the Java SDK contrib architecture (PR #1696, PR #1742) by extracting the flagd evaluation logic into three independent packages:

  • openfeature-flagd-api (tools/openfeature-flagd-api/): Evaluator Protocol defining the contract for flag evaluation, so others can implement their own evaluator
  • openfeature-flagd-core (tools/openfeature-flagd-core/): Reference implementation (FlagdCore) with targeting engine and custom operators (fractional v2, sem_ver, starts_with, ends_with)
  • openfeature-flagd-api-testkit (tools/openfeature-flagd-api-testkit/): Compliance test suite bundling gherkin feature files from the test-harness evaluator/ directory, with pytest-bdd step definitions — installable as a package so custom evaluator implementations can run the same compliance suite
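As a rough illustration of what a Protocol-based evaluator contract can look like, here is a minimal sketch. Only `set_flags` and the `str | dict` input come from this PR; the `resolve_boolean_value` method, its signature, and the `StaticEvaluator` class are hypothetical and are not the actual openfeature-flagd-api definitions.

```python
from __future__ import annotations

import json
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Evaluator(Protocol):
    """Hypothetical contract; the real Protocol may differ."""

    def set_flags(self, flag_configuration: str | dict[str, Any]) -> None: ...

    def resolve_boolean_value(
        self,
        flag_key: str,
        default_value: bool,
        context: dict[str, Any] | None = None,
    ) -> bool: ...


class StaticEvaluator:
    """Toy implementation: ignores targeting, returns the default variant."""

    def __init__(self) -> None:
        self._flags: dict[str, Any] = {}

    def set_flags(self, flag_configuration: str | dict[str, Any]) -> None:
        # Accept either a JSON string or an already-parsed dict.
        if isinstance(flag_configuration, str):
            config = json.loads(flag_configuration)
        else:
            config = flag_configuration
        self._flags = dict(config.get("flags", {}))

    def resolve_boolean_value(
        self,
        flag_key: str,
        default_value: bool,
        context: dict[str, Any] | None = None,
    ) -> bool:
        flag = self._flags.get(flag_key)
        if flag is None or flag.get("state") != "ENABLED":
            return default_value
        value = flag["variants"].get(flag["defaultVariant"], default_value)
        return value if isinstance(value, bool) else default_value
```

Because the contract is structural, `StaticEvaluator` satisfies it without inheriting from `Evaluator`, which is what lets third parties plug in their own evaluator.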

Provider refactoring

  • InProcessResolver now delegates evaluation to FlagdCore via an adapter pattern
  • Old modules (flags.py, targeting.py, custom_ops.py) are thin re-exports from core for backward compatibility
  • Connectors (FileWatcher, GrpcWatcher) remain unchanged
  • No changes to gRPC resolvers, config, or other provider functionality
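The delegation described above could be sketched roughly as follows. `StubCore` and `FlagStoreAdapter` are stand-ins invented for illustration (the PR's own adapter is `_FlagStoreAdapter`, whose code is not shown here), and the return value of `set_flags` is an assumption.

```python
from typing import Any, Callable


class StubCore:
    """Stand-in for an evaluator core; not the real FlagdCore."""

    def __init__(self) -> None:
        self.flags: dict[str, Any] = {}

    def set_flags(self, config: dict[str, Any]) -> list[str]:
        # Assumed behavior: return the keys that were added, removed, or changed.
        new_flags = config.get("flags", {})
        keys = new_flags.keys() | self.flags.keys()
        changed = [k for k in keys if new_flags.get(k) != self.flags.get(k)]
        self.flags = dict(new_flags)
        return sorted(changed)


class FlagStoreAdapter:
    """Exposes the store-style update() that connectors call, backed by a core."""

    def __init__(self, core: StubCore, on_change: Callable[[list[str]], None]) -> None:
        self._core = core
        self._on_change = on_change

    def update(self, config: dict[str, Any]) -> None:
        # Pass the dict straight through; no dict -> JSON -> dict round trip.
        changed = self._core.set_flags(config)
        if changed:
            self._on_change(changed)
```

With this shape the connectors keep calling `update()` as before, and listeners are notified only about the keys that actually changed.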

Fractional bucketing

  • flagd-core implements the v2 fractional algorithm (unsigned hash, integer arithmetic with (hash * totalWeight) >> 32)
  • Includes MAX_WEIGHT_SUM overflow guard, negative weight clamping, explicit bool-as-weight rejection
  • v1 fractional tests are deselected since the implementation is v2
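The bucketing arithmetic above can be sketched as follows. This is a minimal illustration, not the flagd-core code: the function name, the `(variant, weight)` input shape, the value of `MAX_WEIGHT_SUM`, and the choice to return `None` on invalid totals are all assumptions. The hash is passed in as a plain integer so the example does not depend on mmh3.

```python
from __future__ import annotations

# Assumed limit; the actual constant in flagd-core may differ.
MAX_WEIGHT_SUM = 2**31 - 1


def pick_variant(hash32: int, distribution: list[tuple[str, int]]) -> str | None:
    """Map a 32-bit hash onto weighted variants using integer arithmetic only."""
    cleaned: list[tuple[str, int]] = []
    total = 0
    for variant, weight in distribution:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, int):
            raise TypeError(f"weight for {variant!r} must be an int")
        weight = max(0, weight)  # clamp negative weights to zero
        total += weight
        cleaned.append((variant, weight))

    if total == 0 or total > MAX_WEIGHT_SUM:
        return None  # overflow guard / nothing to choose from

    # Treat the hash as unsigned, then scale it into [0, total).
    bucket = ((hash32 & 0xFFFFFFFF) * total) >> 32

    upper = 0
    for variant, weight in cleaned:
        upper += weight
        if bucket < upper:
            return variant
    return None
```

With a 50/50 split, a hash of `0` lands in the first variant and a hash of `2**31` lands in the second, since `(2**31 * 100) >> 32` is exactly 50. No floating-point percentages are involved, so results are reproducible across languages.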

CI & release

  • Added tools/* packages to the build workflow matrix (lint, mypy, tests on Python 3.10–3.14)
  • Added py.typed marker files for PEP 561 compliance
  • Replaced project.scripts with poethepoet tasks to match CI conventions
  • Added release-please config for all 3 new packages
  • Added tools/* to UV workspace members

Other changes

  • Updated test-harness submodule to v3.5.0 (adds evaluator/ directory with gherkin feature files)
  • Dropped Python 3.9 support for tools packages (aligned with rest of project)
  • Schemas and spec submodules kept at main (no changes)

Test plan

  • openfeature-flagd-api unit tests: 10 passed
  • openfeature-flagd-api-testkit smoke tests: 2 passed, mypy clean
  • openfeature-flagd-core unit tests: 27 passed
  • openfeature-flagd-core e2e (testkit compliance): 85 passed, 15 deselected (fractional-v1), 0 failures
  • Provider unit tests: no regressions
  • Lint: ruff check + ruff format clean
  • Type checking: mypy strict clean for all 3 tools packages

How to use the testkit (for custom evaluator implementations)

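# conftest.py
import pytest

from openfeature.contrib.tools.flagd.testkit import load_testkit_flags
# Star import registers the shared pytest-bdd step definitions.
from openfeature.contrib.tools.flagd.testkit.steps import *  # noqa: F403

@pytest.fixture
def evaluator():
    # MyCustomEvaluator is a placeholder for your own Evaluator implementation.
    core = MyCustomEvaluator()
    core.set_flags(load_testkit_flags())
    return core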

🤖 Generated with Claude Code

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a modular architecture for the flagd provider by extracting core evaluation logic and API definitions into separate tools packages. It replaces the internal FlagStore with a new FlagdCore implementation and adds a testkit for compliance testing. I have no feedback to provide on the changes.

@aepfli aepfli force-pushed the feat/extract-flagd-evaluator-api-core-testkit branch from cd14cf4 to 4cd2612 on April 15, 2026 17:18
@github-actions github-actions bot requested a review from federicobond April 15, 2026 17:18
@codecov

codecov bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.09%. Comparing base (564eb68) to head (65f4052).

Files with missing lines                                   Patch %   Lines
...openfeature/contrib/tools/flagd/core/flagd_core.py      93.26%    7 Missing ⚠️
...trib/tools/flagd/testkit/steps/evaluation_steps.py      93.44%    4 Missing ⚠️
...openfeature/contrib/tools/flagd/core/model/flag.py      94.73%    3 Missing ⚠️
...ature/contrib/tools/flagd/core/model/flag_store.py      97.05%    1 Missing ⚠️
...e/contrib/tools/flagd/core/targeting/custom_ops.py      99.21%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #377      +/-   ##
==========================================
+ Coverage   95.91%   96.09%   +0.18%     
==========================================
  Files          30       42      +12     
  Lines        1517     1563      +46     
==========================================
+ Hits         1455     1502      +47     
+ Misses         62       61       -1     

aepfli and others added 2 commits April 15, 2026 19:25
Split the flagd evaluation logic from the provider into three
independent packages under tools/, mirroring the Java SDK contrib
architecture (PRs #1696 and #1742):

- openfeature-flagd-api: Evaluator Protocol defining the contract
  for flag evaluation implementations
- openfeature-flagd-core: Reference implementation with FlagdCore
  class, targeting engine, and custom operators (fractional, sem_ver,
  starts_with, ends_with)
- openfeature-flagd-api-testkit: Compliance test suite bundling
  gherkin feature files from the test-harness evaluator directory

The provider's InProcessResolver now delegates to FlagdCore via an
adapter pattern, keeping connector code (FileWatcher, GrpcWatcher)
unchanged. Old provider modules (flags.py, targeting.py, custom_ops.py)
are thin re-exports from the core package for backward compatibility.

Also updates the test-harness submodule from v2.11.1 to v3.5.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

- Implement fractional v2 bucketing algorithm (unsigned hash, integer
  arithmetic with bit-shift instead of percentage-based float)
- Add MAX_WEIGHT_SUM overflow guard
- Add negative weight clamping (max(0, weight))
- Add explicit bool-as-weight rejection
- Support non-string variant types (str|float|int|bool|None)
- Extract _resolve_bucket_by helper
- Bump mmh3 dependency to >=5.0.0,<6.0.0
- Drop Python 3.9: update requires-python to >=3.10 for all tools packages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
@aepfli aepfli force-pushed the feat/extract-flagd-evaluator-api-core-testkit branch from ad8727a to f34c06a on April 15, 2026 17:26
aepfli and others added 11 commits April 15, 2026 19:32
- Fix ruff violations: UP007 (modern type unions), N818 (rename
  FlagStoreException to FlagStoreError), FURB171 (simplify membership
  test), PERF401 (use list comprehension), S101 (allow assert in steps)
- Add py.typed marker files for PEP 561 compliance
- Revert protobuf evaluation.v2 imports/config back to v1
- Run ruff format on all affected files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

- Add tools/openfeature-flagd-{api,core,api-testkit} to build matrix
- Replace project.scripts with poe tasks to match CI expectations
- Add poethepoet dev dependency to all tools packages
- Remove obsolete scripts.py files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

flagd-core implements the v2 fractional bucketing algorithm, so v1
test expectations don't match. Deselect @fractional-v1 tagged tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

The testkit is a test library, not a test suite. CI's `poe cov` failed
with "no data collected" because tests/ was empty. Add smoke tests to
verify the testkit can be imported and returns valid data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Fix mypy errors: add return types, parameter types, use ErrorCode enum
instead of string, and cast Mapping to dict for indexed assignment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
@aepfli aepfli marked this pull request as ready for review April 15, 2026 17:53
@aepfli aepfli requested review from a team as code owners April 15, 2026 17:53
@aepfli
Member Author

aepfli commented Apr 15, 2026

/gemini review

@aepfli aepfli changed the title from "feat: extract flagd evaluator into api, core, and testkit packages" to "feat(flagd): extract evaluator into api, core, and testkit packages" Apr 15, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the openfeature-flagd-api and openfeature-flagd-core packages to provide a modular evaluator implementation for flagd. It also refactors the existing openfeature-provider-flagd to utilize these new core components. I have provided feedback to improve performance by allowing flag configuration to be passed as a dictionary, avoiding redundant serialization, and suggested fixes for regex escaping and exception handling in the core implementation.

Comment thread tools/openfeature-flagd-api/src/openfeature/contrib/tools/flagd/api/evaluator.py Outdated
- Accept str | dict in Evaluator protocol and FlagdCore, eliminating
  the dict->JSON->dict roundtrip in _FlagStoreAdapter
- Fix ReferenceError handler: use exception instance, not class, and
  log flag.targeting instead of the function object
- Escape evaluator names in $ref regex replacement (re.escape)
- Fix backward-compat FlagStore to emit changed_keys, not all keys
- Fix README import paths (was api.testkit, should be testkit)
- Add content to flagd-core CHANGELOG.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
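To illustrate why escaping matters in the `$ref` replacement, here is a small sketch. The function name and the exact pattern are assumptions made for this example; only the use of `re.escape` on the evaluator name comes from the commit message.

```python
import json
import re
from typing import Any


def inline_evaluators(targeting_json: str, evaluators: dict[str, Any]) -> str:
    """Replace {"$ref": "<name>"} placeholders with the referenced rule."""
    for name, rule in evaluators.items():
        # Without re.escape, a name such as "is.admin" would also match
        # "isXadmin", because "." is a regex wildcard.
        pattern = r'\{\s*"\$ref"\s*:\s*"' + re.escape(name) + r'"\s*\}'
        replacement = json.dumps(rule)
        # A callable replacement keeps backslashes in the JSON from being
        # interpreted as regex group references.
        targeting_json = re.sub(pattern, lambda _m, r=replacement: r, targeting_json)
    return targeting_json
```

For example, with an evaluator named `is.admin`, a placeholder for `isXadmin` in the same targeting rule is left untouched, while the `is.admin` placeholder is replaced.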
@aepfli
Member Author

aepfli commented Apr 15, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the flagd provider by extracting its core evaluation logic into three new standalone packages: openfeature-flagd-api (protocol definition), openfeature-flagd-core (reference implementation), and openfeature-flagd-api-testkit (compliance suite). The InProcessResolver has been updated to use the new FlagdCore evaluator, and existing modules have been refactored to re-export logic for backward compatibility. Review feedback highlights a potential breaking change in the FlagStore.update method signature and suggests a correction for JSON parsing in the testkit utilities.

Replace vendored feature/flag files with a hatch build hook that copies
them from the test-harness submodule's evaluator/ directory. Files are
gitignored and generated fresh on each build via force_include.

Also:
- Update test-harness submodule to v3.5.0 (adds @fractional-v1/v2 tags)
- Add fractional-v1 deselect to provider pytest.ini
- Remove redundant flags.py re-export from testkit
- Address review feedback (dict passthrough, ReferenceError fix, etc.)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
aepfli and others added 5 commits April 15, 2026 20:59
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

The hatch build hook includes files in sdist/wheel via force_include,
but tests run from the source tree. Add a sync script that copies
files from the test-harness submodule, called by poe before test/cov.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
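A sync step of this kind might look roughly like the sketch below. The function name, the file patterns, and the directory layout are assumptions for illustration; the actual script in the PR is not shown here.

```python
import shutil
import tempfile
from pathlib import Path


def sync_testkit_files(
    source_dir: Path,
    target_dir: Path,
    patterns: tuple[str, ...] = ("*.feature", "*.json"),
) -> list[Path]:
    """Copy matching files from source_dir into target_dir, keeping sub-paths."""
    copied: list[Path] = []
    for pattern in patterns:
        for src in sorted(source_dir.rglob(pattern)):
            dest = target_dir / src.relative_to(source_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real submodule checkout.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "test-harness" / "evaluator"
        (source / "gherkin").mkdir(parents=True)
        (source / "gherkin" / "example.feature").write_text("Feature: example\n")
        for path in sync_testkit_files(source, Path(tmp) / "package"):
            print(path.name)
```

Running the copy before `test` and `cov` means the source tree sees the same files that the build hook would place into the sdist or wheel.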
flagd-core e2e tests depend on testkit feature files which are
generated from the test-harness submodule, not checked in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

Explain why the replace('\"', '"') is needed: pytest-bdd preserves
backslash escapes from Gherkin table cells.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
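The effect can be reproduced with a short sketch. The helper name and the sample cell are made up for illustration, and the sketch assumes the cell text arrives with literal backslash-quote pairs, as the commit message describes.

```python
import json
from typing import Any


def parse_table_cell_json(cell: str) -> Any:
    """Parse a JSON value taken from a Gherkin table cell.

    Assumes the cell arrives with quotes still escaped as backslash-quote,
    which is not valid JSON at the top level, so the escapes are removed first.
    """
    return json.loads(cell.replace('\\"', '"'))


# Raw string: the backslashes are part of the cell text, as in the feature file.
raw_cell = r'{\"color\": \"red\"}'
print(parse_table_cell_json(raw_cell))
```

One caveat of this simple approach: a cell whose JSON string values legitimately contain escaped quotes would be altered too, so it only suits test data without nested quotes.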
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>

2 participants