Skip to content

Conversation

@YoYoJa
Copy link
Contributor

@YoYoJa YoYoJa commented Jan 29, 2026

Description

Please add an informative description that covers that changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@YoYoJa YoYoJa requested a review from a team as a code owner January 29, 2026 11:40
Copilot AI review requested due to automatic review settings January 29, 2026 11:40
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Jan 29, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR modifies the evaluation result conversion logic to move error status information from sample.output_status to the sample.error field in the AOAI output format when agent or model responses fail to generate.

Changes:

  • Modified _convert_single_row_to_aoai_format to check for error status in input data and populate the top-level sample field with error information
  • Updated test data files to include sample.output_status fields with both success and error scenarios
  • Changed behavior so that successful evaluations have sample: null instead of detailed sample information

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py Added logic to extract error status from input data and populate sample.error field; removed code that populated sample from criteria metrics
sdk/evaluation/azure-ai-evaluation/tests/unittests/data/evaluation_util_convert_old_output_test.jsonl Updated test data to include sample.output_status fields for both success and error cases
sdk/evaluation/azure-ai-evaluation/tests/unittests/data/evaluation_util_convert_expected_output.json Updated expected output to show sample: null for success cases and sample.error for error cases

Comment on lines 2358 to 2362
if (input_data is not None
and "sample.output_status.status" in input_data
and isinstance(input_data["sample.output_status.status"], str)
and not _is_none_or_nan(input_data["sample.output_status.status"])
and input_data["sample.output_status.status"] != "success"):
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error checking logic has unnecessary parentheses wrapping the entire condition starting on line 2358. While syntactically correct, the outer parentheses spanning lines 2358-2362 make the code harder to read. Consider removing them or restructuring for better readability.

Suggested change
if (input_data is not None
and "sample.output_status.status" in input_data
and isinstance(input_data["sample.output_status.status"], str)
and not _is_none_or_nan(input_data["sample.output_status.status"])
and input_data["sample.output_status.status"] != "success"):
if input_data is not None \
and "sample.output_status.status" in input_data \
and isinstance(input_data["sample.output_status.status"], str) \
and not _is_none_or_nan(input_data["sample.output_status.status"]) \
and input_data["sample.output_status.status"] != "success":

Copilot uses AI. Check for mistakes.
@@ -2354,14 +2354,35 @@ def _convert_single_row_to_aoai_format(
run_output_results = []
top_sample = {}
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initialization top_sample = {} on line 2355 is immediately overwritten by the logic on lines 2358-2372 (either setting it to {"error": error_info} or None). This initialization serves no purpose and could be removed. Consider initializing it as top_sample = None instead for clarity, or remove the initialization entirely since it's set in all branches.

Suggested change
top_sample = {}

Copilot uses AI. Check for mistakes.
@YoYoJa YoYoJa changed the title Move sample.output_status to output_item.sample.error if agent/model response genarated failed Move duplicate sample.output_status to output_item.sample.error if agent/model response genarated failed Jan 29, 2026
YoYoJa and others added 2 commits January 29, 2026 12:09
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants