RUBY-3894 Fix heap buffer overflow in put_string and related methods by jamis · Pull Request #372 · mongodb/bson-ruby

jamis · 2026-06-03T22:23:35Z

Description

RSTRING_LEN returns long, but five sites in ext/bson/write.c stored the result in int32_t, so the length of any string ≥ 2³¹ bytes was silently truncated (exactly 2³¹ bytes becomes INT32_MIN). The negative value was then passed to rb_bson_utf8_validate as size_t, wrapping to near UINT64_MAX and driving reads far past the heap allocation (confirmed heap-buffer-overflow with AddressSanitizer).
Fixed pvt_bson_encode_to_utf8, rb_bson_byte_buffer_put_string, rb_bson_byte_buffer_put_cstring, rb_bson_byte_buffer_put_symbol, and pvt_put_bson_key by changing the length variable to long and adding an explicit > INT32_MAX bounds check before narrowing back to int32_t for the BSON wire format. Strings exceeding the limit now raise ArgumentError.
Regression tests added in spec/bson/string_length_overflow_spec.rb, gated behind STRESS=1 (each example requires ~2 GB of free memory).

RSTRING_LEN returns long, but the native string encoder was storing the result in int32_t. For a string of exactly 2**31 bytes, the length silently wraps to INT32_MIN, which is then passed to rb_bson_utf8_validate as size_t, wrapping to near UINT64_MAX and driving reads far past the heap allocation. Change the length variable to long in pvt_bson_encode_to_utf8, rb_bson_byte_buffer_put_string, rb_bson_byte_buffer_put_cstring, rb_bson_byte_buffer_put_symbol, and pvt_put_bson_key. Add an explicit bounds check in each before the value is narrowed back to int32_t for the BSON wire format (BSON strings are limited to INT32_MAX bytes by spec). Strings exceeding that limit now raise ArgumentError. Regression tests are in spec/bson/string_length_overflow_spec.rb, gated behind STRESS=1 since each example requires ~2 GB of free memory.
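The truncate-then-widen chain described above can be reproduced in isolation. The following is a minimal sketch, not code from the PR: `narrow_unchecked` and `widen_to_size` are hypothetical names, and `int64_t` stands in for the `long` that RSTRING_LEN returns so the sketch behaves the same where `long` is only 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the old pattern: a wide length is stored
 * in an int32_t. Converting an out-of-range value is implementation-defined
 * in C; mainstream two's-complement compilers keep the low 32 bits. */
static int32_t narrow_unchecked(int64_t real_length) {
    return (int32_t)real_length;
}

/* Hypothetical helper mirroring what a callee with a size_t parameter
 * receives: a negative int32_t is converted modulo SIZE_MAX + 1. */
static size_t widen_to_size(int32_t length) {
    return (size_t)length;
}
```

With GCC or Clang, `narrow_unchecked((int64_t)1 << 31)` gives INT32_MIN, and `widen_to_size(INT32_MIN)` gives 2⁶⁴ − 2³¹ on a 64-bit build, which is the "near UINT64_MAX" byte count the validator was asked to scan.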

Copilot

Pull request overview

This PR addresses a critical memory-safety issue in the BSON Ruby native extension where Ruby string lengths (RSTRING_LEN, a long) were truncated into int32_t, potentially wrapping into enormous size_t values and causing out-of-bounds reads during UTF-8 validation and write operations.

Changes:

Adds a centralized length guard (pvt_check_string_length) and switches length variables from int32_t to long in several string-writing paths, raising ArgumentError when oversized.
Narrows to int32_t only after validation to match BSON wire format constraints.
Adds stress-gated regression specs covering put_string, put_cstring, and put_symbol overflow protection.
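As a rough illustration of the "validate first, narrow second" pattern listed above, here is a standalone sketch. It is not the body of pvt_check_string_length: the name `checked_narrow_length` is hypothetical, the Ruby C API is left out so the snippet compiles on its own, and a `false` return stands in for raising ArgumentError.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a centralized length guard.
 * The length stays in a wide type until it is known to fit, and only
 * then is it narrowed to the int32_t used on the BSON wire.
 * Returns false where the extension would raise ArgumentError. */
static bool checked_narrow_length(int64_t length, int32_t *out) {
    if (length < 0 || length > INT32_MAX) {
        return false;              /* too large for a BSON string */
    }
    *out = (int32_t)length;        /* safe: value is within int32_t range */
    return true;
}
```

Keeping the comparison in the wide type is the point of the design: the check runs before any conversion can discard the high bits.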

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `ext/bson/write.c` | Prevents int32 truncation of Ruby string lengths and adds explicit bounds checks before encoding/writing. |
| `spec/bson/string_length_overflow_spec.rb` | Adds STRESS-gated regression coverage for oversized string/symbol inputs triggering the prior overflow path. |

The binary-string encoding path writes length + 5 bytes total (4-byte int32 length prefix + payload + 1-byte null terminator). Allowing length == INT32_MAX caused that addition to overflow signed 32-bit arithmetic, which is undefined behavior in C and could in principle allow a miscompiled bounds check to reintroduce the class of memory safety issue this fix set out to close.
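One way to keep the `length + 5` computation out of signed 32-bit overflow is to reserve the overhead in the bound and do the addition in a wider type. The sketch below assumes exactly that. `STRING_OVERHEAD` and `total_write_size` are hypothetical names, and the exact bound the PR settled on is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed overhead: 4-byte int32 length prefix + 1-byte null terminator. */
#define STRING_OVERHEAD 5

/* Hypothetical helper: computes the total bytes written for a payload of
 * `length` bytes. The bound leaves room for the overhead, and the sum is
 * evaluated in int64_t, so no signed 32-bit overflow can occur.
 * Returns false where the extension would raise ArgumentError. */
static bool total_write_size(int64_t length, int32_t *total) {
    if (length < 0 || length > (int64_t)INT32_MAX - STRING_OVERHEAD) {
        return false;
    }
    *total = (int32_t)(length + STRING_OVERHEAD);
    return true;
}
```

Under this assumed bound, a payload of INT32_MAX − 5 bytes is the largest accepted, and its total is exactly INT32_MAX.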

comandeo-mongo · 2026-06-08T12:19:55Z

  VALUE utf8_string;
  const char *str;
-  int32_t length;
+  long length;


Why do we want to change this?

It's because RSTRING_LEN returns long, not int32_t. For a very large string (2³¹ bytes or more), storing the result in an int32_t could truncate it to a large negative value, which is what led to the out-of-bounds reads described in the PR description.

jamis requested a review from a team as a code owner June 3, 2026 22:23

jamis requested review from comandeo-mongo and Copilot June 3, 2026 22:23

Copilot started reviewing on behalf of jamis June 3, 2026 22:24

Copilot AI reviewed Jun 3, 2026

Comment thread ext/bson/write.c Outdated

comandeo-mongo reviewed Jun 8, 2026

comandeo-mongo approved these changes Jun 8, 2026

jamis merged commit 85fecdd into mongodb:master Jun 8, 2026
48 of 51 checks passed

jamis deleted the 3894-heap-buffer-overflow branch June 8, 2026 14:21

jamis merged 2 commits into mongodb:master from jamis:3894-heap-buffer-overflow

3 participants
