From 4b4773ca36e4f28b1985377240caf1b537ebd72f Mon Sep 17 00:00:00 2001
From: Claude
Date: Thu, 5 Mar 2026 03:40:43 +0000
Subject: [PATCH] Replace first Challenge 5 question with off-topic elephant
 question

Tests whether the RAG system stays grounded when the query has nothing
to do with the corpus topic.

https://claude.ai/code/session_01Ev2pi7ijzrmjY8GW55PLnr
---
 episodes/07-Retrieval-augmented-generation.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/episodes/07-Retrieval-augmented-generation.md b/episodes/07-Retrieval-augmented-generation.md
index f18f5df2..29d8ee02 100644
--- a/episodes/07-Retrieval-augmented-generation.md
+++ b/episodes/07-Retrieval-augmented-generation.md
@@ -370,8 +370,8 @@ Lower `top_k` gives Gemini a tighter, more focused context — good when the ans
 The quality of a RAG system depends heavily on the questions you ask. Try these queries — each tests a different aspect of retrieval and generation:
 
 ```python
-# Broad factual question — answer should be well-supported by multiple papers
-print(ask("How much energy does it cost to train a large language model?"))
+# Off-topic question — not covered by the corpus at all
+print(ask("How much does an elephant weigh?"))
 
 print("\n" + "="*60 + "\n")
 
@@ -391,7 +391,7 @@ For each question, consider:
 
 :::::::::::::::::::::::: solution
 
-The energy-cost question should produce a strong answer because the corpus contains multiple papers with concrete training-energy figures. The cloud-vs-HPC question requires the model to compare across sources — look for whether it hedges appropriately when papers disagree. The "best cloud provider" question is deliberately tricky: the corpus is about environmental costs of AI, not cloud provider rankings, so a well-behaved RAG system should indicate that the context doesn't support a definitive answer rather than generating marketing-style claims.
+The elephant-weight question is deliberately off-topic — the corpus is about environmental costs of AI, not zoology, so a well-behaved RAG system should indicate that the context doesn't contain relevant information rather than answering from general knowledge. The cloud-vs-HPC question requires the model to compare across sources — look for whether it hedges appropriately when papers disagree. The "best cloud provider" question is deliberately tricky: the corpus says nothing about cloud provider rankings, so the system should indicate that the context doesn't support a definitive answer rather than generating marketing-style claims.
 
 :::::::::::::::::::::::::::::::::