
feat: Implement optimization code paths and functionality for initial release#140

Open
andrewklatzke wants to merge 45 commits into main from aklatzke/AIC-2263/sdk-dx-improvements

Conversation

Contributor

@andrewklatzke andrewklatzke commented Apr 17, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Related issues

This PR encapsulates all previous changes in the chain of optimization PRs that were broken up into smaller pieces. Consolidating here so that we can have a single commit/release of the package. The PRs were independently reviewed and approved.

Describe the solution you've provided

See:

#116
#117
#119
#122
#127
#128
#130
#135
#139


Note

High Risk
Large new implementation that orchestrates LLM-driven prompt iteration and performs authenticated LaunchDarkly REST API writes (result persistence and variation creation), so failures could impact optimization data and published configurations.

Overview
Implements the initial functional ldai_optimizer package, replacing the prior ldai_optimization scaffold and renaming the distribution/paths accordingly.

Adds a full OptimizationClient that runs iterative optimization loops (agent calls, judge evaluation, variation generation with retries/validation, token/duration tracking) and supports both option-based runs and API config–driven runs with live result persistence.

Introduces an internal LaunchDarkly REST API client (LDApiClient) with retries/redaction plus utilities/prompts/dataclasses to support structured judge/variation I/O, variable placeholder safety, and optional auto-commit of winning variations back to LaunchDarkly.

Reviewed by Cursor Bugbot for commit 92f51fa. Bugbot is set up for automated code reviews on this repo.

…ype, remove required context_choices argument and default to anon
Comment thread packages/optimization/src/ldai_optimizer/util.py

# variation() returns the raw JSON before chevron.render(), so instructions
# still contain {{placeholder}} tokens rather than empty strings.
raw_variation = self._ldClient._client.variation(agent_key, context, {})
Contributor

Not a blocker, but I would be careful relying on the hidden property. Maybe this is something we need to expose in the public API?

Contributor Author

It's not really "hidden" in the sense that the user shouldn't be able to access it if they're determined; it's just a property that's only used internally. We're using this to re-derive the variables that were present in the initial variation so that the LLM has a stable reference when trying to replace them.
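
For reference, a minimal sketch of what that re-derivation could look like, assuming chevron-style `{{placeholder}}` tokens (the helper name and regex are illustrative, not the package's actual code):

```python
import re

# Chevron-style placeholder tokens, e.g. {{user_name}} or {{ topic }}.
PLACEHOLDER_RE = re.compile(r"{{\s*([\w.]+)\s*}}")

def extract_placeholders(raw_instructions: str) -> set:
    """Return the variable names still present in the pre-render JSON."""
    return set(PLACEHOLDER_RE.findall(raw_instructions))

# extract_placeholders("Hello {{user_name}}, today's topic: {{ topic }}")
# -> {"user_name", "topic"}
```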

Comment on lines +331 to +333
Errors are caught and logged rather than raised so that persistence
failures never abort an in-progress optimization run.

Contributor

Will this handle things like backoff/throttle/retry in the event of a 429? Maybe aborting an in-progress optimization is best in cases of permanent errors (e.g., a 401, which indicates an invalid token), or even sporadic errors like a 403, where certain requests fail and others succeed, which would lead to incorrect results?

Contributor Author

@andrewklatzke andrewklatzke Apr 22, 2026

The API key gets checked up-front with the fetch for model configs etc., so it shouldn't error unless it's revoked midway through. Given that, I'd say it makes sense to fail on any 401s here.

I'll add some retry logic for other status codes up to a max, and then fail the optimization if we hit max retries.
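
For illustration, that might take roughly this shape (the function name, constants, and use of `requests` are assumptions, not the committed implementation):

```python
import time
import requests

MAX_RETRIES = 3

def post_result_with_retries(url, payload, headers):
    """Fail fast on 401; back off and retry other failures up to a max."""
    for attempt in range(MAX_RETRIES):
        resp = requests.post(url, json=payload, headers=headers, timeout=10)
        if resp.status_code == 401:
            # Token revoked mid-run: abort the optimization immediately.
            raise RuntimeError("LaunchDarkly API token is no longer valid")
        if resp.ok:
            return
        # Honor Retry-After on 429s; otherwise back off exponentially.
        time.sleep(float(resp.headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError("Max retries exceeded; failing the optimization")
```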

"current_parameters": {
"type": "object",
"description": "The improved agent parameters (e.g., temperature, max_tokens, etc.)",
"additionalProperties": True,
Contributor

What does this do? Does it allow the LLM to emit arbitrary keys, which then get persisted on agent_optimization_result.parameters and, on auto_commit, pushed as the new AI Config variation's parameters? It looks like we only expect certain values here?

Maybe after extract_json_from_response, filter current_parameters to a known allow-list of LLM call parameters — e.g., {temperature, top_p, max_tokens, presence_penalty, frequency_penalty, stop} — and drop unknown keys with a warning log. This also belongs in the server-side validation on POST /ai-configs/{config_key}/variations.
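
Purely as an illustration of that suggestion (the allow-list itself is the sticking point, per the reply below):

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical allow-list of common LLM call parameters.
ALLOWED_PARAMS = {"temperature", "top_p", "max_tokens",
                  "presence_penalty", "frequency_penalty", "stop"}

def filter_current_parameters(current_parameters):
    """Drop keys outside the allow-list, warning for each one dropped."""
    filtered = {}
    for key, value in current_parameters.items():
        if key in ALLOWED_PARAMS:
            filtered[key] = value
        else:
            logger.warning("Dropping unknown parameter from LLM output: %s", key)
    return filtered
```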

Contributor Author

Let me look into this; it may only exist for posterity at this point (we moved away from tool-based validation on these since some providers didn't play nicely with it).

If we need to keep it, we unfortunately cannot know what is valid in this parameters object, given that the user provides their own LLM call. Each provider's parameters differ, so we could only allow-list if we knew up-front which provider they were using.

"Failed to extract JSON from response. "
"Response length: %d, response: %s",
len(response_str),
response_str,
Contributor

This is not really safe to log, as it likely contains sensitive info.
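
One way to keep the failure debuggable without the raw body, as a sketch only (per the follow-up commits below, the response logging was simply removed):

```python
import hashlib
import logging

logger = logging.getLogger(__name__)

def log_extraction_failure(response_str):
    """Log the length plus a short digest so failures can be correlated
    across runs without leaking the response content."""
    digest = hashlib.sha256(response_str.encode("utf-8")).hexdigest()[:12]
    logger.error(
        "Failed to extract JSON from response. Length: %d, sha256 prefix: %s",
        len(response_str),
        digest,
    )
```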

Comment thread packages/optimization/pyproject.toml Outdated
]
dependencies = [
"launchdarkly-server-sdk-ai>=0.16.0",
"coolname>=2.0.0",
Contributor

Do we need this? Why such an old version? Is having cool names for variation keys worth the risk of this dependency? Would the 'cool' names even be useful/meaningful, vs a default date-based name?

Contributor Author

Discussed in review meeting; will just hand-roll this as it's pretty simple to implement.
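
The hand-rolled version could be as small as this (the word lists and key format are assumptions):

```python
import secrets

# Small in-repo word lists are enough for readable variation keys.
ADJECTIVES = ("brisk", "calm", "eager", "lunar", "vivid")
NOUNS = ("otter", "maple", "comet", "harbor", "quartz")

def generate_variation_key():
    """Return a readable, collision-resistant key like 'vivid-comet-7f3a'."""
    suffix = secrets.token_hex(2)  # avoids collisions between runs
    return f"{secrets.choice(ADJECTIVES)}-{secrets.choice(NOUNS)}-{suffix}"
```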

Comment on lines +270 to +271
Attempts direct JSON parsing first, then progressively falls back to
extracting JSON from markdown code blocks and balanced-brace scanning.
Contributor

The flexibility here is neat, but it increases the risk of adversarial JSON smuggling.

Would it be possible to get metrics on how often the fallback methods are used?

Maybe if a fallback method was used, we should check for unexpected keys in the resulting object and WARN-log them, to hopefully alert the customer that something is amiss?

Contributor Author

Yeah, this is basically just fallback parsing for when the LLM decides to respond with something like:

Here is your JSON:
{...}

Unfortunately, since we don't know which provider the user is using for this, we can't enforce any structured output (for those that even support it).

I think it's worth emitting a warning for this.
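
Sketched out, that tiered parsing plus the warning might look like the following (names and the regex are assumptions; the balanced-brace scan ignores braces inside strings, which is acceptable for a sketch):

```python
import json
import logging
import re

logger = logging.getLogger(__name__)

def extract_json_from_response(response_str):
    """Try direct parsing, then markdown code blocks, then balanced braces."""
    # 1. Happy path: the LLM returned bare JSON.
    try:
        return json.loads(response_str)
    except json.JSONDecodeError:
        pass

    # 2. A fenced markdown code block, optionally tagged as json.
    match = re.search(r"```(?:json)?\s*(.*?)```", response_str, re.DOTALL)
    if match:
        try:
            result = json.loads(match.group(1))
            logger.warning("Fell back to code-block JSON extraction")
            return result
        except json.JSONDecodeError:
            pass

    # 3. Balanced-brace scan: take the first complete {...} span.
    depth, start = 0, None
    for i, ch in enumerate(response_str):
        if ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                logger.warning("Fell back to balanced-brace JSON extraction")
                return json.loads(response_str[start:i + 1])
    raise ValueError("No JSON object found in response")
```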

Comment thread packages/optimization/src/ldai_optimizer/client.py Outdated
Comment thread packages/optimization/src/ldai_optimizer/util.py Outdated
Comment thread packages/optimization/src/ldai_optimizer/client.py

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 7468372.

Comment thread packages/optimization/src/ldai_optimizer/dataclasses.py
Comment thread packages/optimization/src/ldai_optimizer/util.py
Contributor Author

andrewklatzke commented Apr 22, 2026

New commits added with security review results + Cursor feedback on those changes:

8b3c69f
7468372

This includes:

  • Rename of the package to ldai_optimizer
  • Changes the publishing target to the ldai_optimizer package (sec review; package already published)
  • Adds <untrusted> sentinels around user-supplied or LLM-generated data
  • Fills out the README, adding a note about how API keys are used (sec review)
  • Adds a top-level comment with the same information as the README (sec review)
  • Adds a RedactionFilter that scrubs keys from the loggers (sketch after this list)
  • Removes logging of the response in the failure state
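
The RedactionFilter idea, roughly (the key pattern and logger name are assumptions about the shipped code, and the sketch assumes positional log args):

```python
import logging
import re

class RedactionFilter(logging.Filter):
    """Scrub anything that looks like a LaunchDarkly key from log records."""
    _KEY_RE = re.compile(r"\b(?:api|sdk)-[A-Za-z0-9-]{8,}\b")

    def filter(self, record):
        record.msg = self._KEY_RE.sub("[REDACTED]", str(record.msg))
        if record.args:
            record.args = tuple(
                self._KEY_RE.sub("[REDACTED]", a) if isinstance(a, str) else a
                for a in record.args
            )
        return True  # never drop records, only rewrite them

logging.getLogger("ldai_optimizer").addFilter(RedactionFilter())
```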

312161f

Includes:

  • Retry logic for posting results
  • Structural validation for the variation we assemble from the LLM response

3bce893

Includes:

  • Removed unused function that had additionalProperties: True

@andrewklatzke
Contributor Author

e8c6692

Includes:

  • Adding token limit handling

@@ -1,7 +1,7 @@
 [project]
-name = "launchdarkly-server-sdk-ai-optimization"
+name = "ldai_optimizer"
Contributor

I won't block on this, but all other LaunchDarkly Python packages have the launchdarkly-* name.

**Requirements**

- [x] I have added test coverage for new or changed functionality
- [x] I have followed the repository's [pull request submission
guidelines](../blob/main/CONTRIBUTING.md#submitting-pull-requests)
- [x] I have validated my changes against all supported platform
versions

**Describe the solution you've provided**

Implements handling for "inverted" judges.

**Describe alternatives you've considered**

This gets feature parity with our online evals functionality; no
alternatives considered.

**Additional context**

When a metric has `is_inverted` set, it's intended that the evaluation
of the score flips from `>=` to `<=`. This adds a util `_judge_passed`
to handle that logic and implements it throughout. We don't surface the
inverted property in the SDK, so we fetch the judge directly to get this
information.
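
The flip itself is a one-liner; a sketch consistent with those semantics
(the exact signature in the package may differ):

```python
def judge_passed(score, threshold, is_inverted):
    """Inverted metrics treat lower scores as better, so the comparison flips."""
    return score <= threshold if is_inverted else score >= threshold

# judge_passed(0.2, 0.5, is_inverted=True)   -> True  (lower is better)
# judge_passed(0.2, 0.5, is_inverted=False)  -> False (fails the threshold)
```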

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes core judge pass/fail semantics and adds per-judge REST calls
(`get_ai_config`) during config-driven runs, which could affect
optimization outcomes and introduce new failure/performance modes if the
API is unavailable or slow.
> 
> **Overview**
> Adds first-class support for **inverted judges** (where *lower* scores
are better) by introducing a shared `judge_passed` helper and using it
for pass/fail decisions in `OptimizationClient` and in prompt feedback
generation.
> 
> Extends `OptimizationJudge` with an `is_inverted` flag and, for
`optimize_from_config`, fetches each judge’s `isInverted` value via
`api_client.get_ai_config` when building options. Updates logging to
include the inverted status, and adds targeted tests covering the
helper, mixed inverted/standard evaluation, config building behavior,
and `variation_prompt_feedback` output.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
a8f14de. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
**Requirements**

- [x] I have added test coverage for new or changed functionality
- [x] I have followed the repository's [pull request submission
guidelines](../blob/main/CONTRIBUTING.md#submitting-pull-requests)
- [x] I have validated my changes against all supported platform
versions

**Describe the solution you've provided**

Implements better handling for params on the initial variation (folds
model changes in as overwrites rather than completely replacing) and
ensures custom params are persisted unchanged. Additionally makes sure
that tools cannot be changed by the optimization process.

**Describe alternatives you've considered**

This is the result of a bug report so there weren't really alternatives
considered.

**Additional context**

Initially it was assumed that the optimization process would properly
pull params forward (via the LLM), but this doesn't always seem to be
the case. Custom params aren't fed into the LLM calls since they're
user-specified data (not specific to the actual optimization result),
so we now just pull these through as-is. In the case of tools, the
model will be able to optimize the prompt to call a specific tool if
multiple are provided, but we don't want to strip any tool information
from the final result, as it may be necessary for the calls to
function.
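
A minimal sketch of the merge-and-restore behavior described above
(names and dict shapes are assumptions):

```python
import logging

logger = logging.getLogger(__name__)

def apply_llm_variation(existing_params, custom_params, original_tools, llm_result):
    """Merge LLM parameter changes as overwrites; pass custom params and tools through."""
    # Overlay rather than replace, so values the LLM omitted
    # (e.g. max_tokens, response_format) survive the iteration.
    merged = {**existing_params, **llm_result.get("current_parameters", {})}

    # Tools are never optimizable: warn if the LLM changed them, then restore.
    if llm_result.get("tools") not in (None, original_tools):
        logger.warning("LLM returned modified tools; restoring the originals")

    return {
        "parameters": merged,
        "custom": custom_params,   # user-specified data, passed through as-is
        "tools": original_tools,
    }
```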

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Moderate risk because it changes how model parameters and `tools` are
carried forward across optimization iterations and what gets
auto-committed, which can affect runtime agent behavior if
merging/restoration logic is wrong.
> 
> **Overview**
> Improves variation-application logic so LLM-generated
`current_parameters` are **merged** into existing parameters instead of
replacing them, preserving user-specified/custom settings (e.g.
`max_tokens`, `response_format`) when the LLM omits them.
> 
> Prevents tool drift by always restoring the original `tools` list (and
logging when the LLM returns a different one) to avoid silently dropping
user tools or leaking internal framework tools.
> 
> Captures `model.custom` from the initial LaunchDarkly variation and
includes it when auto-committing a winning variation; adds focused test
coverage for parameter persistence, tool restoration/warnings, and
`model.custom` propagation.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
572a2aa. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->