Update cost-tracking for OAI chatcompletions and response API #260

Merged

amanjaiswal73892 merged 2 commits into main on Jul 9, 2025
Conversation
Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.

| Issue | Status |
|---|---|
| Incomplete API type detection logic | ✅ Fix detected |
| Inconsistent dictionary key names causing AttributeError | ✅ Fix detected |
| Missing Response Context in Log | |
| Redundant Token Extraction Logic | |
Files scanned
| File Path | Reviewed |
|---|---|
| src/agentlab/llm/tracking.py | ✅ |
src/agentlab/llm/tracking.py (Outdated)

Comment on lines 166 to 167:

    if 'prompt_tokens_details' in usage:
        usage['cached_tokens'] = usage['prompt_token_details'].cached_tokens

This comment was marked as resolved.
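The resolved issue above is the key mismatch flagged by the review: the snippet checks `'prompt_tokens_details'` but then indexes `'prompt_token_details'` (singular `token`). A minimal sketch of the corrected lookup, using a plain dict and `SimpleNamespace` as stand-ins for the parsed API usage payload:

```python
from types import SimpleNamespace

# Stand-in for the usage dict built from an OpenAI chat-completions response;
# the nested object mimics the prompt_tokens_details attribute shape.
usage = {
    "prompt_tokens": 120,
    "prompt_tokens_details": SimpleNamespace(cached_tokens=40),
}

if "prompt_tokens_details" in usage:
    # Read the same key that was tested, avoiding the KeyError the
    # original 'prompt_token_details' spelling would raise.
    usage["cached_tokens"] = usage["prompt_tokens_details"].cached_tokens

print(usage["cached_tokens"])  # 40
```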
src/agentlab/llm/tracking.py (Outdated)

    if usage is None:
        logging.warning("No usage information found in the response. Defaulting cost to 0.0.")
        return 0.0
    api_type = 'chatcompletion' if hasattr(usage, "prompt_tokens_details") else 'response'

This comment was marked as resolved.
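The `hasattr`-based detection in the line above can be exercised in isolation. This is a sketch with `SimpleNamespace` stand-ins for the two usage-object shapes; the attribute names match the snippet, but the objects themselves are hypothetical substitutes for the real OpenAI usage types:

```python
from types import SimpleNamespace

def detect_api_type(usage) -> str:
    """Mirror the one-liner above: chat-completions usage objects expose
    prompt_tokens_details, while Responses-API usage objects do not."""
    return "chatcompletion" if hasattr(usage, "prompt_tokens_details") else "response"

# Chat-completions-shaped usage: prompt/completion token names.
chat_usage = SimpleNamespace(
    prompt_tokens=100,
    completion_tokens=20,
    prompt_tokens_details=SimpleNamespace(cached_tokens=30),
)

# Responses-API-shaped usage: input/output token names.
resp_usage = SimpleNamespace(
    input_tokens=100,
    output_tokens=20,
    input_tokens_details=SimpleNamespace(cached_tokens=30),
)

print(detect_api_type(chat_usage))  # chatcompletion
print(detect_api_type(resp_usage))  # response
```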
src/agentlab/llm/tracking.py (Outdated)

Comment on lines 310 to 319:

    if api_type == 'chatcompletion':
        total_input_tokens = usage.prompt_tokens
        output_tokens = usage.completion_tokens
        cached_input_tokens = usage.prompt_tokens_details.cached_tokens
        non_cached_input_tokens = total_input_tokens - cached_input_tokens
    elif api_type == 'response':
        total_input_tokens = usage.input_tokens
        output_tokens = usage.output_tokens
        cached_input_tokens = usage.input_tokens_details.cached_tokens
        non_cached_input_tokens = total_input_tokens - cached_input_tokens
Redundant Token Extraction Logic

What is the issue?
The token extraction logic is duplicated across the two branches, with only the attribute names differing between API types.

Why this matters
The duplicated structure makes the code harder to maintain and obscures the fact that both branches follow the same pattern.
Suggested change ∙ Feature Preview
    TOKEN_MAPPINGS = {
        'chatcompletion': {
            'total_tokens': 'prompt_tokens',
            'output_tokens': 'completion_tokens',
            'details_attr': 'prompt_tokens_details'
        },
        'response': {
            'total_tokens': 'input_tokens',
            'output_tokens': 'output_tokens',
            'details_attr': 'input_tokens_details'
        }
    }

    mapping = TOKEN_MAPPINGS.get(api_type)
    if mapping:
        total_input_tokens = getattr(usage, mapping['total_tokens'])
        output_tokens = getattr(usage, mapping['output_tokens'])
        details = getattr(usage, mapping['details_attr'])
        cached_input_tokens = details.cached_tokens
        non_cached_input_tokens = total_input_tokens - cached_input_tokens
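The suggested mapping-based refactor can be run end-to-end with a stand-in usage object. This sketch wraps the same logic in a hypothetical `extract_tokens` helper (not part of the PR) and uses `SimpleNamespace` in place of the real OpenAI usage type:

```python
from types import SimpleNamespace

# Attribute-name mapping per API type, as in the suggestion above.
TOKEN_MAPPINGS = {
    "chatcompletion": {
        "total_tokens": "prompt_tokens",
        "output_tokens": "completion_tokens",
        "details_attr": "prompt_tokens_details",
    },
    "response": {
        "total_tokens": "input_tokens",
        "output_tokens": "output_tokens",
        "details_attr": "input_tokens_details",
    },
}

def extract_tokens(usage, api_type):
    """Return (non_cached_input, cached_input, output) token counts
    for either API type, using the mapping to pick attribute names."""
    mapping = TOKEN_MAPPINGS[api_type]
    total_input = getattr(usage, mapping["total_tokens"])
    output = getattr(usage, mapping["output_tokens"])
    cached = getattr(usage, mapping["details_attr"]).cached_tokens
    return total_input - cached, cached, output

# Chat-completions-shaped stand-in: 100 input tokens, 40 of them cached.
chat_usage = SimpleNamespace(
    prompt_tokens=100,
    completion_tokens=25,
    prompt_tokens_details=SimpleNamespace(cached_tokens=40),
)
print(extract_tokens(chat_usage, "chatcompletion"))  # (60, 40, 25)
```

Collapsing both branches into one lookup means a future API type only needs a new mapping entry, not a new `elif` block.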
recursix approved these changes on Jul 9, 2025
This pull request refines token usage tracking and cost calculation logic in src/agentlab/llm/tracking.py. The changes improve handling of cached tokens and introduce better differentiation between API types (chatcompletion vs. response) for more accurate cost computation.

Enhancements to token usage tracking:

src/agentlab/llm/tracking.py: Updated the __call__ method to extract and include cached_tokens from prompt_tokens_details or input_tokens_details in the usage dictionary, ensuring more precise tracking of cached token usage.

Improvements to cost calculation logic:

src/agentlab/llm/tracking.py: Refactored the get_effective_cost_from_openai_api method to handle two distinct API types (chatcompletion and response). Added logic to compute cached_input_tokens and non_cached_input_tokens separately for each type, improving the accuracy of effective cost calculations. Introduced warnings for missing usage information or unsupported API types.

Description by Korbit AI

What change is being made?

Update cost-tracking logic for the OpenAI chatcompletion and response APIs to account for token caching and improve cost calculation accuracy.

Why are these changes being made?

The change introduces handling for prompt_tokens_details and input_tokens_details to accurately assess cached tokens, addressing previous shortcomings in cost estimation by accounting for the difference between cached and non-cached tokens. This ensures more precise cost tracking, provides fallback handling for missing API usage information, and logs unsupported API types to prevent miscalculations.
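The cost logic described above splits input tokens into cached and non-cached counts because cached input is typically billed at a discounted rate. A sketch of the resulting effective-cost formula; the function name and all per-token prices here are illustrative assumptions, not the PR's actual implementation or real OpenAI pricing:

```python
def effective_cost(non_cached_input, cached_input, output,
                   input_price, cached_price, output_price):
    """Effective cost when cached input tokens are billed at a
    discounted per-token rate. All prices are per single token."""
    return (non_cached_input * input_price
            + cached_input * cached_price
            + output * output_price)

# Hypothetical per-token prices (illustrative only).
cost = effective_cost(
    non_cached_input=60, cached_input=40, output=25,
    input_price=2e-6, cached_price=1e-6, output_price=8e-6,
)
print(f"{cost:.6f}")  # 0.000360
```

Without the cached/non-cached split, all 100 input tokens would be billed at the full input rate, overstating the cost.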