docs: add correctness, goal success rate, coherence evaluator examples #714
ef9aace to df060f3
df060f3 to c0a9fbe
c0a9fbe to 2ac192c
poshinchen left a comment
Hmm, can we align the terminology? I don't want to mix reference and assertions.
2ac192c to 7abee56
Assessment: Comment
The examples follow the established patterns in the
Review Categories
The new examples are a solid addition to the evals documentation suite.
Documentation Preview Ready
Your documentation preview has been successfully deployed!
Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-714/docs/user-guide/quickstart/overview/
Updated at: 2026-04-16T19:16:45.863Z
Co-authored-by: Kang Zhou <kangzhou1991@gmail.com>
Co-authored-by: Subramanian Chidambaram <subbu10123@gmail.com>
7abee56 to 99a9c7c
Addressing Review Feedback
I've made the following changes based on the automated review:
1. Differentiated Goal Success Rate Assertion Examples
Replaced the identical math test cases in
These scenarios emphasize goal achievement (did the agent complete the task?) rather than factual correctness (is the answer right?), which clarifies the distinction between the two evaluators for users.
2. Grammar Fix
Fixed pre-existing typo in
3. Completeness - "With Reference" Mode
The existing files already cover both modes:
No additional example file is needed.
Assessment: Approve
The prior review feedback has been mostly addressed: the grammar fix ("an experiment") is applied in all 3 new files that had the issue, and the "with reference" mode clarification makes sense. The examples are consistent with the established patterns in the
Outstanding Suggestion
Clean, well-structured additions to the evals examples. 👍
Description
Add examples for:
Related Issues
strands-agents/evals#95
Type of Change
Checklist
npm run dev
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.