adding a tag on screenshots for mouse_click coordinate actions#240
adding a tag on screenshots for mouse_click coordinate actions#240
Conversation
|
Based on your review schedule, I'll hold off on reviewing this PR until it's marked as ready for review. If you'd like me to take a look now, comment
|
There was a problem hiding this comment.
Pull Request Overview
This PR adds functionality to tag screenshots with a red dot when a "mouse_click" action is detected, giving a visual cue for coordinate-based actions.
- Introduces the new function tag_screenshot_with_action to parse and render mouse_click coordinates on screenshots.
- Updates the update_screenshot and update_screenshot_pair functions to apply the tagging functionality.
There was a problem hiding this comment.
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
|---|---|---|
| Missing Coordinate Parsing Error Handling ▹ view | ✅ Fix detected | |
| Inconsistent Screenshot Pair Annotation ▹ view |
Files scanned
| File Path | Reviewed |
|---|---|
| src/agentlab/analyze/agent_xray.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
src/agentlab/analyze/agent_xray.py
Outdated
| coords = [c.strip() for c in coords] | ||
| if coords[0].startswith("x="): | ||
| coords[0] = coords[0][2:] | ||
| if coords[1].startswith("y="): | ||
| coords[1] = coords[1][2:] |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| if s1 is not None: | ||
| s1 = tag_screenshot_with_action(s1, info.exp_result.steps_info[info.step].action) |
There was a problem hiding this comment.
Inconsistent Screenshot Pair Annotation 
Tell me more
What is the issue?
The screenshot annotation logic in update_screenshot_pair() only annotates the first screenshot (s1) but not the second one (s2).
Why this matters
This inconsistency means users won't see click markers on the second screenshot, making it harder to track the progression of actions across screenshot pairs.
Suggested change ∙ Feature Preview
def update_screenshot_pair(som_or_not: str):
global info
s1 = get_screenshot(info, info.step, som_or_not)
s2 = get_screenshot(info, info.step + 1, som_or_not)
if s1 is not None:
s1 = tag_screenshot_with_action(s1, info.exp_result.steps_info[info.step].action)
if s2 is not None and info.step + 1 < len(info.exp_result.steps_info):
s2 = tag_screenshot_with_action(s2, info.exp_result.steps_info[info.step + 1].action)
return s1, s2Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull Request Overview
This PR adds a new function to annotate screenshots with a red marker for mouse click coordinate actions, improving the visual context for debugging. The key changes include:
- Introducing tag_screenshot_with_action() to parse mouse click actions and draw a marker on screenshots.
- Updating update_screenshot() and update_screenshot_pair() to use the new tagging function.
src/agentlab/analyze/agent_xray.py
Outdated
| draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red") | ||
| except (ValueError, IndexError) as e: | ||
| warning(f"Failed to parse action '{action}': {e}") |
There was a problem hiding this comment.
The function tag_screenshot_with_action does not return the modified screenshot after drawing the marker. Consider adding a return statement at the end of the function (both after a successful draw and outside the if block) so that the updated image is returned.
| draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red") | |
| except (ValueError, IndexError) as e: | |
| warning(f"Failed to parse action '{action}': {e}") | |
| draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red") | |
| return screenshot | |
| except (ValueError, IndexError) as e: | |
| warning(f"Failed to parse action '{action}': {e}") | |
| return screenshot |
Description by Korbit AI
What change is being made?
Add functionality to tag screenshots with action coordinates when the action is a mouse click, by drawing a dot on the screenshot at the specified coordinates.
Why are these changes being made?
The changes enhance the interpretability of screenshots by visually indicating where mouse click actions occurred, thereby aiding in better analysis and debugging during experiment result inspections. This approach provides a clear visual representation without modifying the core logic of action recording or screenshot capturing.