feat: add trace visualization to display_sample_record (#396) #438
Open
feat: add trace visualization to display_sample_record (#396) #438
Conversation
Render LLM conversation traces (produced by `with_trace != TraceType.NONE`) as readable conversation flows in `display_sample_record()`. Two backends: Rich terminal panels (styled by role) and Jupyter HTML block diagrams. - New module `trace_renderer.py` with `TraceMessage` TypedDict and `TraceRenderer` class (`render_rich`, `render_notebook_html`) - `include_traces` parameter on both mixin and standalone function (defaults to True, opt out with `include_traces=False`) - Traces shown after Generated Columns table, before images - Unit tests for various trace shapes and integration tests Made-with: Cursor
Extract is_notebook_environment() utility to replace scattered get_ipython() try/except blocks. Improve Rich trace readability with better colors, separators, text folding, and dedented content. Match HTML trace font to Rich monospace output. Move index label to top of display and reduce inter-table spacing. Made-with: Cursor
Contributor
Greptile SummaryThis PR adds LLM conversation trace visualization to Key changes:
Issues found:
|
| Filename | Overview |
|---|---|
| packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py | New module providing Rich terminal and Jupyter HTML trace rendering; contains a logic error in the tool-call turn-count calculation that inflates "turns" for parallel tool calls. |
| packages/data-designer-config/src/data_designer/config/utils/visualization.py | Integrates trace rendering into display_sample_record; trace panels are excluded from render_list in notebook mode, causing them to be silently omitted from any saved HTML file when save_path is provided in a notebook. |
| packages/data-designer-config/src/data_designer/config/utils/misc.py | Adds is_notebook_environment() helper that correctly gates on ZMQInteractiveShell to avoid false positives; clean centralisation of a previously inline check. |
| packages/data-designer-config/tests/config/utils/test_trace_renderer.py | Comprehensive 338-line test suite covering Rich and HTML rendering paths; the saved-HTML integration test only runs in non-notebook mode so it does not cover the notebook+save_path omission case. |
Sequence Diagram
sequenceDiagram
participant Caller
participant display_sample_record
participant is_notebook_environment
participant render_list
participant TraceRenderer
participant IPython
Caller->>display_sample_record: record, config_builder, include_traces, save_path
display_sample_record->>is_notebook_environment: check
is_notebook_environment-->>display_sample_record: in_notebook (bool)
alt in_notebook == True
display_sample_record->>IPython: display HTML index label
else
display_sample_record->>render_list: append Text index label
end
display_sample_record->>render_list: append seed/generated/code/validation tables
alt include_traces == True
display_sample_record->>TraceRenderer: instantiate
loop each LLM column with trace side-effect
alt in_notebook == False
TraceRenderer->>render_list: append Rich Panel (render_rich)
end
display_sample_record->>display_sample_record: traces_to_display_later.append(...)
end
end
alt save_path set
display_sample_record->>render_list: Console.print → save HTML/SVG
Note over display_sample_record: traces NOT in saved file when in_notebook
else
display_sample_record->>render_list: Console.print to terminal
end
alt in_notebook and include_traces and traces_to_display_later
TraceRenderer->>IPython: render_notebook_html → display(HTML(...))
end
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py
Line: 157-160
Comment:
**Turn count is inflated for parallel tool calls**
`turn_ids` is populated with one entry per individual tool-call ID (line 124), so when a single assistant message issues *N* parallel tool calls (all IDs are distinct), `len(turn_ids)` equals N rather than 1. The summary will therefore read "2 tool calls in 2 turns" for the `multi_turn_tool_trace` fixture, which is a single LLM inference turn that returned two calls at once.
The "turns" semantic should count the number of assistant messages that contained tool calls — i.e. the number of separate LLM inference requests — not the number of unique call IDs. A simple fix:
```python
turn_count = sum(1 for msg in traces if msg.get("tool_calls")) if tool_call_count > 0 else 0
call_word = "call" if tool_call_count == 1 else "calls"
turn_word = "turn" if turn_count == 1 else "turns"
summary = f"{tool_call_count} tool {call_word} in {turn_count} {turn_word}" if tool_call_count > 0 else ""
```
This also makes the `turn_ids` set — which is currently only used to derive `turn_count` — unnecessary, and it can be removed.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: packages/data-designer-config/src/data_designer/config/utils/visualization.py
Line: 361-363
Comment:
**Trace content is silently dropped from saved files in notebook mode**
In a Jupyter notebook (`in_notebook=True`) trace panels are intentionally excluded from `render_list` and deferred to `traces_to_display_later` for IPython HTML rendering. The problem is that the `save_path` code path runs *before* that deferred display, and it only captures what is in `render_list`:
```python
# visualization.py – save path captures render_list only
if save_path is not None:
recording_console.print(Group(*render_list), markup=False) # ← no trace panels here
_save_console_output(recording_console, save_path, theme=theme)
# Traces are displayed afterwards, only to the live notebook cell
if in_notebook and include_traces and len(traces_to_display_later) > 0:
trace_renderer.render_notebook_html(...)
```
A user who calls `display_sample_record(record, builder, save_path="report.html")` from a notebook cell will see traces in the cell output but their saved file will contain no trace data at all. `test_display_sample_record_with_trace_in_saved_html` does not catch this because it runs in a non-notebook environment where traces *do* enter `render_list`.
Consider also adding the Rich trace panel to `render_list` when `save_path` is explicitly requested (regardless of `in_notebook`), so saved files are always complete:
```python
if not in_notebook or save_path is not None:
render_list.append(pad_console_element(trace_renderer.render_rich(traces, side_col)))
```
Alternatively, document the limitation clearly in the `include_traces` docstring so callers are not surprised.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "Merge branch 'main' ..."
packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py
Outdated
Show resolved
Hide resolved
packages/data-designer-config/src/data_designer/config/utils/visualization.py
Show resolved
Hide resolved
packages/data-designer-config/src/data_designer/config/utils/misc.py
Outdated
Show resolved
Hide resolved
- Remove double html.escape() on func_name in render_notebook_html; _build_html_block already escapes the title - Guard notebook trace display with include_traces to prevent potential NameError if trace_renderer is not instantiated - Improve is_notebook_environment() to check shell class name (ZMQInteractiveShell / google.colab._shell) instead of just get_ipython() existence, avoiding false positives in IPython terminals Made-with: Cursor
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
📋 Summary
Adds LLM conversation trace visualization to
display_sample_record, allowing users to inspect the full chain-of-thought, tool calls, and tool results forLLM-generated columns. Supports both terminal (Rich) and Jupyter notebook (HTML) rendering.
🔄 Changes
✨ Added
TraceRendererclass in [trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py) with Rich terminal and Jupyter HTML rendering
is_notebook_environment()helper inmisc.pyfor centralized notebook detectioninclude_tracesparameter ondisplay_sample_recordandWithRecordSamplerMixin(defaultTrue)test_trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/tests/config/utils/test_trace_renderer.py) (338 lines, covering Rich rendering, HTML rendering, edge cases, and integration)
🔧 Changed
visualization.py: consolidated notebook detection to use sharedis_notebook_environment(), added trace rendering for all LLM column types, moved record indexdisplay to top, normalized padding defaults
🔍 Attention Areas
trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py) — New module: core rendering logic for both Rich and HTML output, typed trace message structures
visualization.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/src/data_designer/config/utils/visualization.py) — Integration point: trace column discovery, rendering order, and padding changes
In ipynb notebook
In console
🤖 Generated with AI