Proposal: Add a 16-problem RAG / agent failure checklist as an advanced debugging section #199

@onestardao

Description of the feature request:

The quickstart demonstrates a powerful deep research agent using Gemini + LangGraph. When users adapt it to their own topics and data, they often run into subtle failures:

  • The agent misses obvious sources even though they are accessible on the web.
  • The reasoning chain oscillates or becomes shallow when questions are more complex.
  • Small config changes (search parameters, prompts) fix one scenario but quietly break others.
  • It is hard to tell whether a failure is caused by retrieval, graph structure, prompt design, or model choice.

Right now, each user has to invent their own mental model for these failures. An explicit 16-problem failure map gives them a common vocabulary and a concrete process: classify the failure mode first, then choose targeted fixes (adjust search, chunking, graph edges, or prompts) instead of random trial-and-error.
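As a rough illustration of the "classify first, then pick targeted fixes" process, here is a minimal Python sketch. It is not taken from the 16-problem map: the four categories are simply the candidate causes named above (retrieval, graph structure, prompt design, model choice), and `FailureCause`, `FIRST_FIXES`, and `suggest_fixes` are hypothetical names. The fix listed for model choice is my own assumption.

```python
from enum import Enum


class FailureCause(Enum):
    """Coarse causes named in this issue; NOT the 16 entries of the map."""
    RETRIEVAL = "retrieval"
    GRAPH_STRUCTURE = "graph structure"
    PROMPT_DESIGN = "prompt design"
    MODEL_CHOICE = "model choice"


# Illustrative mapping only: which kind of fix to look at first per cause.
FIRST_FIXES = {
    FailureCause.RETRIEVAL: ["adjust search parameters", "revisit chunking"],
    FailureCause.GRAPH_STRUCTURE: ["review graph edges"],
    FailureCause.PROMPT_DESIGN: ["revise prompts"],
    # Assumption: the issue text does not name a fix for model choice.
    FailureCause.MODEL_CHOICE: ["compare against another model"],
}


def suggest_fixes(causes):
    """Return fix suggestions for the classified causes, in order, without duplicates."""
    suggestions = []
    for cause in causes:
        for fix in FIRST_FIXES[cause]:
            if fix not in suggestions:
                suggestions.append(fix)
    return suggestions
```

For example, `suggest_fixes([FailureCause.RETRIEVAL, FailureCause.PROMPT_DESIGN])` would list the search and chunking fixes before the prompt fix, so the tuning order follows the diagnosis.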

What problem are you trying to solve with this feature?

I’d like to propose a small docs-only addition to this quickstart: an “Advanced: diagnosing research agent failures (16-problem map)” section.

The idea is to document a compact failure checklist for the deep research agent shown in this repo. The checklist is based on a 16-problem map for RAG / agent failures (WFGY ProblemMap) and is purely text-based: users keep using this quickstart as is, but have an optional one-page poster + triage prompt they can consult when the agent behaves unexpectedly.

Concretely, the new docs section would:

  • Briefly explain that most failures in research agents fall into a small set of reproducible patterns (retrieval drift, chunking issues, config drift, search strategy problems, etc.).
  • Link to the 16-problem map and show how to use it: copy a failing trace (queries, retrieved snippets, answer) plus the poster into a strong LLM and ask, “Which failure modes apply here, and what structural fixes should I try first?”
  • Emphasize that this is an advanced, optional debugging tool and does not change the quickstart code or APIs.
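To make the triage step concrete, here is a minimal Python sketch of how a failing trace and the poster text could be combined into one prompt. `FailingTrace` and `build_triage_prompt` are hypothetical helpers, not part of this quickstart or of the map. The poster text is passed in as a plain string the user has copied, and sending the prompt to a model is left out on purpose.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailingTrace:
    """Hypothetical container for one failing run; field names are illustrative."""
    question: str
    queries: list[str] = field(default_factory=list)
    snippets: list[str] = field(default_factory=list)
    answer: str = ""


def build_triage_prompt(trace: FailingTrace, poster_text: str) -> str:
    """Combine the failure map text and a failing trace into one prompt string."""
    queries = "\n".join(f"- {q}" for q in trace.queries) or "- (none recorded)"
    snippets = (
        "\n".join(f"[{i}] {s}" for i, s in enumerate(trace.snippets, 1))
        or "(none retrieved)"
    )
    return (
        "Failure map:\n"
        f"{poster_text}\n\n"
        "Failing trace:\n"
        f"Question: {trace.question}\n"
        f"Search queries:\n{queries}\n"
        f"Retrieved snippets:\n{snippets}\n"
        f"Final answer: {trace.answer}\n\n"
        "Which failure modes apply here, and what structural fixes should I try first?"
    )
```

The resulting string can be pasted into any strong LLM. Keeping the trace in a small structure like this also makes it easy to save failing cases and re-run the same triage later.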

Any other information you'd like to share?

The failure checklist I am suggesting is based on WFGY ProblemMap, an open-source 16-problem failure map for RAG / LLM pipelines and agentic systems (MIT-licensed).

It is already used in several ecosystems:

  • RAGFlow – integrates the map as an official RAG failure modes checklist in their docs.
  • LlamaIndex – incorporates it into their RAG troubleshooting documentation.
  • ToolUniverse (Harvard MIMS Lab) – wraps it as a triage tool for incident analysis.
  • Awesome LLM Apps and Awesome-AITools – curated lists that reference it as a diagnostics toolkit.

Map entry point (README + poster + triage prompt):

If this feature request is of interest, I’m happy to draft a concrete docs section (Markdown) tailored to this quickstart and open a PR so the team can review and adjust it.
