
Add node for object detection using yolo obb inference#45

Merged
kluge7 merged 18 commits into main from feat/yolo-obb-inference
Feb 19, 2026

Conversation

@jenscaa
Contributor

@jenscaa jenscaa commented Jan 29, 2026

Introduces a ROS 2 node for YOLO OBB-based object detection, publishing rotated bounding boxes and annotated image outputs.

@jenscaa jenscaa requested a review from kluge7 January 29, 2026 13:00
@jenscaa jenscaa self-assigned this Jan 29, 2026
@codecov

codecov bot commented Jan 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (bc52ee0) to head (ef369dc).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main     #45   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files          3       3           
  Lines        144     144           
  Branches      14      14           
=====================================
  Misses       144     144           
Flag Coverage Δ
unittests 0.00% <ø> (ø)

Flags with carried forward coverage won't be shown.

Copilot AI left a comment (Contributor)
Pull request overview

This pull request introduces a new ROS 2 package for object detection using YOLO Oriented Bounding Box (OBB) models. The package subscribes to image topics, runs OBB inference, and publishes both structured detections with rotation information and annotated images.

Changes:

  • Added yolo_obb_object_detection package with node implementation for OBB-based object detection
  • Implemented utility functions for OBB detection and visualization with fallback to axis-aligned boxes
  • Configured launch files, parameters, and build system for ROS 2 integration

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
yolo_obb_object_detection/yolo_obb_object_detection/yolo_utils.py Utility functions for loading YOLO models and processing frames with OBB detection
yolo_obb_object_detection/yolo_obb_object_detection/yolo_obb_object_detection_node.py Main ROS 2 node that subscribes to images and publishes OBB detections and annotated outputs
yolo_obb_object_detection/requirements.txt Python package dependencies (numpy and ultralytics)
yolo_obb_object_detection/package.xml ROS 2 package metadata and dependencies
yolo_obb_object_detection/launch/yolo_obb_object_detection.launch.py Launch file with device validation and parameter loading
yolo_obb_object_detection/config/yolo_obb_object_detection_params.yaml Default parameters for model path, confidence threshold, and topics
yolo_obb_object_detection/README.md Package documentation with usage instructions
yolo_obb_object_detection/CMakeLists.txt CMake build configuration for the ROS 2 package
yolo_obb_object_detection/model/.gitkeep Placeholder for model directory
yolo_obb_object_detection/yolo_obb_object_detection/init.py Python package initialization


)

r = results[0]
annot = r.plot()
Copilot AI Feb 5, 2026

The node uses YOLO's built-in r.plot() method to generate the annotated image (line 83), ignoring the custom annotation logic implemented in yolo_utils._draw_obb() and yolo_utils.process_frame(). This leaves the custom drawing functions in yolo_utils.py unused and redundant. Either use the custom drawing logic, which would keep the door open for future customization, or remove the unused drawing code from yolo_utils.py.

Suggested change
annot = r.plot()
annot = yolo_utils.process_frame(frame, r)

Comment on lines +10 to +17
ALLOWED_DEVICES = ['cpu', '0', 'cuda:0']


def validate_device(device: str):
if device not in ALLOWED_DEVICES:
raise RuntimeError(
f"Invalid device '{device}'. Choose one of: {', '.join(ALLOWED_DEVICES)}"
)
Copilot AI Feb 5, 2026

The device validation in ALLOWED_DEVICES is restrictive and may not cover all valid PyTorch/YOLO device specifications. Valid devices can include 'cuda:1', 'cuda:2', etc. for multi-GPU systems, 'mps' for Apple Silicon, and other formats. Consider either removing this validation to allow any device string, or expanding it to support the full range of valid device specifications (e.g., using a pattern like 'cuda:\d+').
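The expanded pattern the comment suggests could look like the following minimal sketch. The exact set of accepted forms (CPU, Apple Silicon, bare GPU indices, `cuda:N`) is an assumption for illustration, not the package's actual policy:

```python
import re

# Assumed set of valid device forms: 'cpu', 'mps', bare GPU indices
# like '0', and 'cuda:N' for any GPU index N.
_DEVICE_RE = re.compile(r"^(cpu|mps|\d+|cuda:\d+)$")

def validate_device(device: str) -> None:
    # Raise on anything that does not match a known device form.
    if not _DEVICE_RE.match(device):
        raise RuntimeError(f"Invalid device '{device}'.")
```

With this pattern, multi-GPU strings such as `cuda:1` pass validation while clearly malformed values still fail fast at launch.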

Comment on lines +17 to +18
<exec_depend>python3-opencv</exec_depend>
<exec_depend>python3-numpy</exec_depend>
Copilot AI Feb 5, 2026

The package declares both system package dependencies (python3-opencv, python3-numpy via exec_depend) and specific pip dependencies (numpy==1.24.4 in requirements.txt). This dual dependency declaration could lead to version conflicts or confusion about which versions are actually used. Consider either using only pip dependencies (removing lines 17-18) or only system packages (removing requirements.txt), depending on your deployment strategy. For reference, the similar package 'yolo_object_detection' does not include these system dependencies in package.xml.

Suggested change
<exec_depend>python3-opencv</exec_depend>
<exec_depend>python3-numpy</exec_depend>

@@ -0,0 +1,23 @@
# yolo_obb_object_detection
Real-time object detection using YOLOv26. This node subscribes to an image topic, runs inference, and publishes both structured detections and an annotated image with oriented bounding boxes (OBB).
Copilot AI Feb 5, 2026

The README mentions "YOLOv26" which does not exist. YOLO versions include YOLOv3, YOLOv4, YOLOv5, YOLOv8, YOLOv10, and YOLOv11, but there is no YOLOv26. This appears to be a typo and should likely be "YOLOv8" or the appropriate version being used.

Suggested change
Real-time object detection using YOLOv26. This node subscribes to an image topic, runs inference, and publishes both structured detections and an annotated image with oriented bounding boxes (OBB).
Real-time object detection using YOLO-based models. This node subscribes to an image topic, runs inference, and publishes both structured detections and an annotated image with oriented bounding boxes (OBB).

Comment on lines +19 to +100
def process_frame(frame, model, conf, device):
results = model.predict(frame, conf=conf, verbose=False, device=device)
detections = []
annotated = frame.copy()

for r in results:
# ---- Prefer OBB outputs if present ----
if hasattr(r, "obb") and r.obb is not None:
obb = r.obb
xywhr = (
obb.xywhr.cpu().numpy()
if hasattr(obb.xywhr, "cpu")
else np.asarray(obb.xywhr)
)
confs = (
obb.conf.cpu().numpy()
if hasattr(obb.conf, "cpu")
else np.asarray(obb.conf)
)
clss = (
obb.cls.cpu().numpy()
if hasattr(obb.cls, "cpu")
else np.asarray(obb.cls)
)

for (cx, cy, w, h, theta), sc, cid in zip(xywhr, confs, clss):
cid_i = int(cid)
sc_f = float(sc)
theta_f = float(theta)

detections.append(
(float(cx), float(cy), float(w), float(h), theta_f, sc_f, cid_i)
)

# Draw rotated box + label
box = _draw_obb(annotated, cx, cy, w, h, theta_f)
label = f"{model.names.get(cid_i, str(cid_i))} {sc_f:.2f}"

# Put label near first corner
x0, y0 = int(box[0][0]), int(box[0][1])
cv2.putText(
annotated,
label,
(x0, max(y0 - 5, 0)),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(0, 255, 0),
1,
cv2.LINE_AA,
)
continue

# ---- Fallback to axis-aligned boxes ----
if not hasattr(r, "boxes") or r.boxes is None:
continue

b = r.boxes
xyxy = b.xyxy.cpu().numpy() if hasattr(b.xyxy, "cpu") else np.asarray(b.xyxy)
confs = b.conf.cpu().numpy() if hasattr(b.conf, "cpu") else np.asarray(b.conf)
clss = b.cls.cpu().numpy() if hasattr(b.cls, "cpu") else np.asarray(b.cls)

for (x1, y1, x2, y2), sc, cid in zip(xyxy, confs, clss):
cid_i = int(cid)
sc_f = float(sc)

detections.append((float(x1), float(y1), float(x2), float(y2), sc_f, cid_i))

p1, p2 = (int(x1), int(y1)), (int(x2), int(y2))
cv2.rectangle(annotated, p1, p2, (0, 255, 0), 2)
label = f"{model.names.get(cid_i, str(cid_i))} {sc_f:.2f}"
cv2.putText(
annotated,
label,
(p1[0], max(p1[1] - 5, 0)),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(0, 255, 0),
1,
cv2.LINE_AA,
)

return detections, annotated
Copilot AI Feb 5, 2026

The function process_frame in yolo_utils.py is never used in the main node implementation. The node reimplements the detection logic directly in the on_image method. Either this function should be used to avoid code duplication, or it should be removed if it's not needed. Additionally, the process_frame function returns detections in a different format than what the node needs (tuples vs Detection2D messages), which may explain why it's not being used.

Suggested change: remove the entire process_frame function quoted above, keeping only the note below:
# Note: previously there was a `process_frame` helper here that duplicated
# detection logic used directly in the main node. It has been removed to
# avoid unused, duplicate code and to keep this utility module focused on
# shared, actually-used helpers such as `_draw_obb`.


det_array.detections.append(det)
else:
self.get_logger().warn("No OBB output detected — is this an OBB model?")
Copilot AI Feb 5, 2026

Use get_logger().warning() instead of get_logger().warn(). The warn method is deprecated in Python's logging module and ROS 2's logger interface. The standard method is warning().

Suggested change
self.get_logger().warn("No OBB output detected — is this an OBB model?")
self.get_logger().warning("No OBB output detected — is this an OBB model?")

Comment on lines +19 to +100 (the process_frame function quoted above)
Copilot AI Feb 5, 2026

The detections returned by process_frame for OBB (oriented bounding boxes) have a different structure (7 elements: cx, cy, w, h, theta, score, class_id) compared to the fallback axis-aligned boxes (6 elements: x1, y1, x2, y2, score, class_id). This inconsistency will cause issues for any consumer of this function trying to process the returned detections, as they won't know which format to expect. Consider returning a consistent structure or clearly documenting this behavior and providing separate functions for OBB and regular detection.
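One way to make the return format uniform, sketched here with a hypothetical Detection record that is not part of the package: represent every box in center/size/angle form and use theta = 0.0 for the axis-aligned fallback, so consumers always see seven fields:

```python
from dataclasses import dataclass

# Hypothetical unified record; theta is 0.0 for axis-aligned boxes,
# so OBB and fallback detections share one shape.
@dataclass
class Detection:
    cx: float
    cy: float
    w: float
    h: float
    theta: float  # radians; 0.0 when the box is axis-aligned
    score: float
    class_id: int

def from_xyxy(x1: float, y1: float, x2: float, y2: float,
              score: float, class_id: int) -> Detection:
    # Convert corner coordinates to center/size with zero rotation.
    return Detection((x1 + x2) / 2.0, (y1 + y2) / 2.0,
                     x2 - x1, y2 - y1, 0.0, score, class_id)
```

The fallback branch would then append `from_xyxy(...)` results instead of 6-tuples, and downstream code can treat both cases identically.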

Comment on lines +8 to +10
return YOLO(model_path)


Copilot AI Feb 5, 2026

The conf parameter passed to load_model is not used within the function. The function only returns YOLO(model_path) without utilizing the confidence threshold. Either remove the unused parameter or apply it to the model configuration if needed.

Suggested change
return YOLO(model_path)
"""
Load a YOLO model from the given path and optionally set a default
confidence threshold.
Parameters
----------
model_path : str
Path to the YOLO model weights.
conf : float or None
Default confidence threshold to apply to the model configuration.
If None, the model's internal default is used.
"""
model = YOLO(model_path)
# If a confidence threshold is provided, apply it to the model's config
if conf is not None:
try:
# Many Ultralytics YOLO versions support configuring defaults via `overrides`
model.overrides["conf"] = conf
except Exception:
# If for some reason `overrides` is not available or not writable,
# silently ignore to avoid breaking existing behavior.
pass
return model

Comment on lines +75 to +83
results = self.model.predict(
source=frame,
conf=self.conf,
device=self.device,
verbose=False,
)

r = results[0]
annot = r.plot()
Copilot AI Feb 5, 2026

The yolo_utils module is imported and its load_model function is called, but then the model.predict is called directly in the on_image method rather than using yolo_utils.process_frame. This creates code duplication - the frame processing logic exists in yolo_utils.py but is reimplemented in the node. Consider using the process_frame function from yolo_utils to avoid duplication and maintain consistency.

Comment on lines +88 to +111
if r.obb is not None:
xywhr = r.obb.xywhr.cpu().numpy()
confs = r.obb.conf.cpu().numpy()
clss = r.obb.cls.cpu().numpy()

for (cx, cy, w, h, theta), sc, cid in zip(xywhr, confs, clss):
det = Detection2D()
det.header = msg.header

hyp = ObjectHypothesisWithPose()
hyp.hypothesis.class_id = str(int(cid))
hyp.hypothesis.score = float(sc)
det.results.append(hyp)

det.bbox = BoundingBox2D()
det.bbox.center.position.x = float(cx)
det.bbox.center.position.y = float(cy)
det.bbox.center.theta = float(theta) # radians
det.bbox.size_x = float(w)
det.bbox.size_y = float(h)

det_array.detections.append(det)
else:
self.get_logger().warn("No OBB output detected — is this an OBB model?")
Copilot AI Feb 5, 2026

Missing error handling for the case when r.obb is None. While line 111 logs a warning when no OBB output is detected, this happens after attempting to publish an empty detection array. If the model is not an OBB model, this will result in publishing empty detections on every frame, which could be misleading to downstream consumers. Consider either raising an error during initialization to fail fast, or handling this case more gracefully (e.g., only warn once, or fall back to regular bounding boxes).
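The "only warn once" option could be sketched as a small wrapper around the node's logger. WarnOnce is a hypothetical helper for illustration, not an rclpy API:

```python
class WarnOnce:
    """Deduplicating logger wrapper: each distinct message is
    emitted at most once, so a per-frame condition does not
    flood the log."""

    def __init__(self, logger):
        self._logger = logger
        self._seen = set()

    def warning(self, msg: str) -> None:
        # Log the message only the first time it is seen.
        if msg not in self._seen:
            self._seen.add(msg)
            self._logger.warning(msg)
```

In the node, `self._warn_once = WarnOnce(self.get_logger())` could replace the direct logger call in the OBB-missing branch, keeping the warning visible without repeating it on every frame.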

Copilot AI left a comment (Contributor)

Pull request overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 10 comments.



Comment on lines +30 to +34
if r.obb is None:
raise RuntimeError(
"Loaded model does not output OBB predictions. "
"Make sure you are using a YOLO26-OBB model."
)
Copilot AI Feb 18, 2026

Model validation occurs during image processing rather than at initialization. If a non-OBB YOLO model is loaded, the node will crash on the first image callback with a RuntimeError. Consider adding validation during model loading (in _load_model method) by running a test inference with a dummy image to fail fast at startup rather than during operation.
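A fail-fast check along those lines might look like this sketch. It assumes only that the model object exposes a predict method whose results carry an obb attribute, as ultralytics YOLO results do; the function name and image size are illustrative:

```python
import numpy as np

def assert_obb_model(model, imgsz: int = 64) -> None:
    """Run one inference on a dummy frame at startup and raise if
    the model does not produce OBB output, so a wrong model fails
    at load time instead of in the first image callback."""
    dummy = np.zeros((imgsz, imgsz, 3), dtype=np.uint8)
    results = model.predict(dummy, verbose=False)
    if not results or getattr(results[0], "obb", None) is None:
        raise RuntimeError("Loaded model does not output OBB predictions.")
```

Calling this from the node's model-loading path would surface the configuration error during launch rather than mid-operation.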

return box


def process_frame(frame, model, conf, device):
Copilot AI Feb 18, 2026

The function process_frame lacks a docstring. The existing yolo_object_detection package includes docstrings for this function (see yolo_object_detection/yolo_utils.py:12-18), which explains the function's purpose, parameters, and return values. For consistency with the codebase, add a similar docstring here.

Suggested change
def process_frame(frame, model, conf, device):
def process_frame(frame, model, conf, device):
"""
Run YOLO-OBB inference on a single frame and return detections and an annotated frame.
Parameters
----------
frame : np.ndarray
Input image in BGR format (as read by OpenCV).
model : ultralytics.YOLO
Loaded YOLO-OBB model used to perform inference.
conf : float
Confidence threshold for filtering detections.
device : str or int
Device identifier passed to the model (e.g. "cpu", "cuda", or GPU index).
Returns
-------
detections : list[tuple[float, float, float, float, float, float, int]]
List of detections, each as (cx, cy, w, h, theta, score, class_id), where
(cx, cy) is the box center, (w, h) are width and height, theta is the rotation
angle in radians, score is the confidence, and class_id is the class index.
annotated : np.ndarray
Copy of the input frame with oriented bounding boxes and labels drawn.
"""

kluge7 and others added 5 commits February 18, 2026 17:27
…bject_detection_node.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…bject_detection_node.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@kluge7 kluge7 merged commit 474ae7e into main Feb 19, 2026
5 checks passed
@kluge7 kluge7 deleted the feat/yolo-obb-inference branch February 19, 2026 12:29