Report Generation: Adding notebooks to better explain the usage by lotif · Pull Request #61 · VectorInstitute/eval-agents

lotif · 2026-02-19T15:16:32Z

Summary

Adding notebooks to better explain the usage and features of the Report Generation Agent. Also performing minor bug fixes.

Clickup Ticket(s): NA

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📝 Documentation update
🔧 Refactoring (no functional changes)
⚡ Performance improvement
🧪 Test improvements
🔒 Security fix

Changes Made

Adding three notebooks to:

Download and import the dataset into an SQLite DB
Run the agent with online evaluations
Run the offline evaluations
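For readers unfamiliar with the first step, here is a minimal sketch of importing tabular data into an SQLite DB using only the Python standard library. It is not the code from the notebook: the `import_csv_to_sqlite` helper, the `sales` table name, and the sample rows are all hypothetical stand-ins for the real downloaded dataset.

```python
import csv
import io
import sqlite3


def import_csv_to_sqlite(csv_text: str, conn: sqlite3.Connection, table: str) -> int:
    """Load CSV text into a table, creating one TEXT column per header field."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so header names containing spaces still work.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Hypothetical sample data standing in for the downloaded dataset.
SAMPLE_CSV = "invoice_id,country,amount\n1,Canada,10.5\n2,Brazil,7.25\n"

conn = sqlite3.connect(":memory:")  # use a file path instead to persist the DB
inserted = import_csv_to_sqlite(SAMPLE_CSV, conn, "sales")
```

Every column is stored as TEXT to keep the sketch short; a real import would likely declare proper column types.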

Also making a few adjustments and fixing a few bugs found while making the notebooks.

Testing

Tests pass locally (uv run pytest tests/)
Type checking passes (uv run mypy <src_dir>)
Linting passes (uv run ruff check src_dir/)
Manual testing performed (describe below)

Manual testing details:

Ran the notebooks and made sure they work.

Checklist

Code follows the project's style guidelines
Self-review of code completed
Documentation updated (if applicable)
No sensitive information (API keys, credentials) exposed

amrit110

Just the one comment. The code itself looks good!

amrit110 · 2026-02-19T15:27:20Z

implementations/report_generation/01_Importing_the_Dataset.ipynb

I'm generally not in favour of committing the outputs of cells. It adds a lot of clutter to the git history, and the outputs can change between runs as well. So consider clearing the outputs and committing only the code.
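As an illustration of what clearing the outputs means at the file level, the sketch below strips outputs from a notebook represented as a dict in the nbformat 4 JSON layout. The `clear_outputs` helper and the sample notebook are hypothetical and not part of this PR.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical two-cell notebook used only to demonstrate the helper.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')\n"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
        },
    ],
}

cleaned = clear_outputs(example)
```

In practice the same result is usually obtained with `jupyter nbconvert --clear-output --inplace <notebook>.ipynb` rather than custom code.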


amrit110

Just minor typos. Actually, you could try adding this pre-commit hook and running it once for all the notebooks:

https://github.com/VectorInstitute/aieng-template-uv/blob/main/.pre-commit-config.yaml#L45C1-L49C17

amrit110 · 2026-02-19T18:36:14Z

implementations/report_generation/03_Running_Offline_Evaluations.ipynb

+    "\n",
+    "Offline evaluations are evaluations run against a **pre-defined dataset**. It performs **detailed evaluations** of the **outputs** of the agentic system and the **steps** it has taken to produce those evaluations.\n",
+    "\n",
+    "This dataset is called the **expected results** or the **ground-truth** dataset, and on this case it's a **handcrafted** dataset with **inputs, oputputs and trajectory** for a few known use cases.\n",


typo: outputs
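To make the idea in the quoted notebook text concrete, here is a rough sketch of an offline evaluation loop over a ground-truth dataset with inputs, outputs and trajectory. It is not the implementation from the notebook: the dataclasses, the two scoring functions, the tool names, and `fake_agent` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GroundTruthSample:
    input: str
    expected_output: str
    expected_trajectory: list[str] = field(default_factory=list)


@dataclass
class AgentRun:
    output: str
    trajectory: list[str]


def output_score(sample: GroundTruthSample, run: AgentRun) -> float:
    """1.0 when the final output matches the expected one exactly, else 0.0."""
    return 1.0 if run.output.strip() == sample.expected_output.strip() else 0.0


def trajectory_score(sample: GroundTruthSample, run: AgentRun) -> float:
    """Fraction of expected steps found, in order, in the actual trajectory."""
    if not sample.expected_trajectory:
        return 1.0
    position, hits = 0, 0
    for step in sample.expected_trajectory:
        try:
            # Search only after the previously matched step to enforce order.
            position = run.trajectory.index(step, position) + 1
            hits += 1
        except ValueError:
            continue
    return hits / len(sample.expected_trajectory)


def run_offline_evaluation(
    samples: list[GroundTruthSample], agent: Callable[[str], AgentRun]
) -> list[dict]:
    """Run the agent on every sample and score its output and steps."""
    results = []
    for sample in samples:
        run = agent(sample.input)
        results.append(
            {
                "input": sample.input,
                "output_score": output_score(sample, run),
                "trajectory_score": trajectory_score(sample, run),
            }
        )
    return results


# Hypothetical ground-truth sample and stand-in agent.
samples = [
    GroundTruthSample("Top 3 countries by sales", "report.xlsx", ["query_db", "write_report"])
]


def fake_agent(question: str) -> AgentRun:
    return AgentRun("report.xlsx", ["query_db", "format_rows", "write_report"])


results = run_offline_evaluation(samples, fake_agent)
```

Exact string matching is the simplest possible output check; a real evaluator may well use a softer comparison.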

amrit110 · 2026-02-19T18:40:30Z

implementations/report_generation/03_Running_Offline_Evaluations.ipynb

+   "source": [
+    "## Running the Evaluations\n",
+    "\n",
+    "To run those two evaluatoirs against all of the ground-truth dataset samples, run the function below:"


typo: evaluators

lotif · 2026-02-19T20:47:35Z

@amrit110 the pre-commit hook didn't work very well. It's picking up certifi (the package) as a typo and not looking into the notebooks. I'm gonna skip adding it for now.
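One possible workaround, which the thread does not describe, is to check only the markdown cells of each notebook against a small list of known misspellings. The sketch below assumes the nbformat 4 JSON layout; the helper names and the corrections map (seeded with the two typos flagged in this review) are hypothetical.

```python
import re


def markdown_lines(notebook: dict) -> list[str]:
    """Collect the text lines of every markdown cell in a notebook dict."""
    lines: list[str] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # nbformat allows the source to be a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        lines.extend(text.splitlines())
    return lines


def find_typos(lines: list[str], corrections: dict[str, str]) -> list[tuple[int, str, str]]:
    """Return (line number, misspelled word, suggested fix) for known typos."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            fix = corrections.get(word.lower())
            if fix:
                findings.append((number, word, fix))
    return findings


# Typos taken from the review comments above.
CORRECTIONS = {"oputputs": "outputs", "evaluatoirs": "evaluators"}

# Hypothetical notebook: the code cell is ignored, so `certifi` is never checked.
notebook = {
    "cells": [
        {"cell_type": "code", "source": ["import certifi\n"], "outputs": []},
        {
            "cell_type": "markdown",
            "source": [
                "## Running the Evaluations\n",
                "\n",
                "To run those two evaluatoirs against all samples\n",
            ],
        },
    ]
}

findings = find_typos(markdown_lines(notebook), CORRECTIONS)
```

Because only an explicit list of misspellings is matched, package names such as certifi cannot be flagged, at the cost of missing typos that are not in the list.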

lotif added 5 commits February 17, 2026 13:50

Halfway through the first notebook

53fd42e

Merge branch 'main' into marcelo/notebooks

41766a2

Finishing the first notebook, adding the second notebook

9e239a7

Finishing up the notebooks

6a5d8ca

Small adjustments to the notebooks

38ecb4b

lotif requested review from amrit110 and fcogidi February 19, 2026 15:16

amrit110 requested changes Feb 19, 2026


amrit110 and others added 3 commits February 19, 2026 10:28

Merge branch 'main' into marcelo/notebooks

87a38bc

Some other small improvements

d0724f8

Merge remote-tracking branch 'origin/marcelo/notebooks' into marcelo/…

49a6832

…notebooks

amrit110 approved these changes Feb 19, 2026


CR by Amrit

3122783

lotif merged commit ad70681 into main Feb 19, 2026
3 checks passed

lotif deleted the marcelo/notebooks branch February 19, 2026 20:56

lotif merged 9 commits into main from marcelo/notebooks

2 participants
