
feat: adds reporting for cost and latency optimization failures#180

Open
andrewklatzke wants to merge 1 commit into aklatzke/AIC-2465/cost-optimization from aklatzke/AIC-2474/report-cost-latency-failures

Conversation


@andrewklatzke andrewklatzke commented May 7, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Describe the solution you've provided

This is intended to demystify some of the results we're receiving from the optimization package. Specifically:

  • Total token counts are now accrued and reported with each result, so we can see whether a user crosses the total allowed tokens threshold
  • If cost or latency is being optimized against, its score is now reported as an item in the score result so that it can be shown in the UI
  • Finally, if quality has already met the required threshold, the prompt now contains instructions to optimize only against cost (when cost is being optimized against)
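The third point could be sketched as follows. This is purely illustrative: the function, constant names, and prompt text are hypothetical, not the package's actual API.

```python
# Hypothetical sketch of the prompt switch described above; the real
# package's names and prompt wording will differ.
QUALITY_AND_COST_PROMPT = (
    "Improve the prompt to raise quality scores while reducing cost."
)
COST_ONLY_PROMPT = (
    "Quality already meets the required threshold. Preserve existing "
    "behavior and optimize only to reduce cost."
)

def build_variation_prompt(quality_passing: bool, optimizing_cost: bool) -> str:
    """Pick the instruction used to generate the next prompt variation."""
    if quality_passing and optimizing_cost:
        return COST_ONLY_PROMPT
    return QUALITY_AND_COST_PROMPT
```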

Describe alternatives you've considered

This is in some ways a bug fix, since it previously wasn't clear to the user what was causing a failure. It is technically additional feature work, but likely required to surface the information needed to make failures actionable for the user.

Additional context

Cost and latency are only optimized for (and only contribute scores) when the configuration triggers the keywords that enable them. "Base" implementations that don't use these features are unaffected.


Note

Medium Risk
Changes optimization pass/fail logic and persisted result payloads (new gate scores, baseline handling, token-budget semantics), which could affect when runs succeed/fail and what the UI/API receives.

Overview
Improves optimization run reporting by tracking and persisting a single accumulated_token_usage total across agent, judge, and variation calls, and including it in result PATCH payloads (extending generationTokens to allow accumulated_total).
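The token-accrual described above could look roughly like this. The class, method, and payload field names are assumptions for illustration, not the package's real implementation; only `accumulated_token_usage`, `generationTokens`, and `accumulated_total` come from the PR summary.

```python
# Hypothetical sketch of the accumulated-token tracking described above.
class TokenAccumulator:
    def __init__(self) -> None:
        self.accumulated_token_usage = 0

    def record(self, tokens_used: int) -> None:
        """Accrue tokens from an agent, judge, or variation call."""
        self.accumulated_token_usage += tokens_used

    def patch_payload(self) -> dict:
        """Token fields included in the result PATCH body (shape assumed)."""
        return {
            "generationTokens": {
                "accumulated_total": self.accumulated_token_usage,
            }
        }

usage = TokenAccumulator()
usage.record(1200)  # agent call
usage.record(300)   # judge call
usage.record(450)   # variation call
```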

Refactors latency/cost optimization to use explicit baseline values (not history[0]), caps history growth (_trim_history) for both standard and ground-truth flows, and adds synthetic _latency_gate/_cost_gate score entries so gate failures are visible in results.

Adjusts run control flow so pass/fail is evaluated before token-limit checks (including GT batches and validation), and updates variation prompting to focus purely on cost reduction when quality is already passing; also relaxes the cost gate tolerance from 20% to 10% improvement and expands tests accordingly.
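The relaxed cost gate might be checked along these lines. The 10% (previously 20%) figure is from the summary; the baseline comparison logic is an assumption for illustration.

```python
# Hypothetical cost-gate check against an explicit baseline (not history[0]).
REQUIRED_IMPROVEMENT = 0.10  # relaxed from 0.20 in this PR

def cost_gate_passes(baseline_cost: float, candidate_cost: float) -> bool:
    """Pass when the candidate improves on the baseline cost by at least
    the required fraction (sketch; the real comparison may differ)."""
    if baseline_cost <= 0:
        return False
    improvement = (baseline_cost - candidate_cost) / baseline_cost
    return improvement >= REQUIRED_IMPROVEMENT
```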

Reviewed by Cursor Bugbot for commit 365fa94. Bugbot is set up for automated code reviews on this repo. Configure here.

@andrewklatzke andrewklatzke requested a review from a team as a code owner May 7, 2026 22:07

@cursor Bot left a comment


Cursor Bugbot has reviewed your changes using default mode and found 1 potential issue.



```python
    """
    if not self._history or not self._options.judges:
        return False
    recent = self._history[-1]
```

GT optimizer incorrectly reports quality as passing

Medium Severity

_all_judges_passing only inspects self._history[-1] (the last entry), but in the ground-truth optimizer all N sample results from a failed attempt are extended into history before _generate_new_variation is called. If the last sample's judges happened to pass while an earlier sample's judges failed, this method incorrectly returns True. The variation prompt then tells the LLM to "preserve existing behavior and only reduce cost," preventing it from addressing the quality failures in other samples.


