Skip to content

Comments

Add accumulative allocation tracking to allocators#3056

Open
JimBobSquarePants wants to merge 6 commits intomainfrom
js/accumulative-memory-limit
Open

Add accumulative allocation tracking to allocators#3056
JimBobSquarePants wants to merge 6 commits intomainfrom
js/accumulative-memory-limit

Conversation

@JimBobSquarePants
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

Fixes #2850

This pull request introduces a new accumulative allocation tracking system to the MemoryAllocator infrastructure, allowing for a configurable cap on the total memory allocated over the allocator's lifetime. It adds new configuration options, implements tracking logic, and updates all relevant allocators to enforce and release this new accumulative allocation limit. Other minor improvements include enhanced documentation and some code style refinements.

Accumulative Allocation Tracking and Limits

  • Added an accumulative allocation limit to MemoryAllocator, configurable via the new AccumulativeAllocationLimitMegabytes property in MemoryAllocatorOptions. This limit is enforced across all allocations, with platform-specific defaults (2GB for 32-bit, 8GB for 64-bit processes). [1] [2]
  • Implemented logic in MemoryAllocator to track total allocated bytes, reserve/release allocation, suppress tracking during batch operations, and wrap allocations with a tracking owner that releases the reservation upon disposal. [1] [2]
  • Updated SimpleGcMemoryAllocator, UnmanagedMemoryAllocator, and UniformUnmanagedMemoryPoolMemoryAllocator to enforce the accumulative allocation limit and to properly track and release allocations. [1] [2] [3] [4] [5]

Configuration and API Improvements

  • Extended MemoryAllocatorOptions with the new AccumulativeAllocationLimitMegabytes property, including validation and documentation. [1] [2]
  • Improved XML documentation throughout the allocator APIs to clarify the new behaviors and configuration options. [1] [2]

Interface and Code Style Enhancements

  • Made interface members in IMemoryGroup<T> explicitly public for consistency and clarity. [1] [2] [3]
  • Minor code style and documentation improvements in buffer and enumerator implementations. [1] [2]

These changes collectively provide stronger safety against excessive memory consumption, offer more configurability for consumers, and improve maintainability and clarity of the memory allocation subsystem.

@antonfirsov
Copy link
Member

I'm planning to take a look later this week.

@JimBobSquarePants
Copy link
Member Author

@antonfirsov Thanks mate!

Copy link
Member

@antonfirsov antonfirsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the complexity introduced in this PR doesn't justify the value.

Capping the total outstanding memory acts like a duplication of the platform OOM, but without the actual knowledge of the underlying memory constraints -- instead dying at an arbitrary value the user would need to fine-tune. I don't see this improving ease of use or security.

Even if we decide that a total cap is the way, I think a way simpler implementation should be possible. When allocating a group we can do the following:

  • Check if the allocator has the required memory throw if it doesn't. This should handle 99% of the cases without introduce any suppression mechanism.
  • As we process the individual allocations, escape at the first InvalidMemoryOperationException and dispose the already allocated buffers. Since this is the slow path anyways, the added allocation/deallocation ceremony should be ok.
  • Pull up the tracking logic to base classes.

I believe however, that it would be better to go with my original proposal from #2850. With that, allocating an outlier oversized GIF would fail with InvalidImageContentException would fail the same way (and for the same conditions) as allocating an outlier oversized JPG.

I can explore a minimalistic & heuristic solution for that. If it's not doable #2850 is probably not worth the trouble.

/// The allocation limit is determined based on the process architecture. For 64-bit processes,
/// the limit is higher than for 32-bit processes.
/// </remarks>
internal long AccumulativeAllocationLimitBytes { get; private protected set; } = Environment.Is64BitProcess ? 8L * OneGigabyte : 2L * OneGigabyte;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unwanted breaking change. Users operating high load app/service on decent hardware may hit this arbitrary limitation pretty quickly, requiring them to extend it manually.

Even if we choose to limit the total outstanding memory, the default limit should be infinite.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaults to long.MaxValue now.

@JimBobSquarePants
Copy link
Member Author

I believe the complexity introduced in this PR doesn't justify the value.

Capping the total outstanding memory acts like a duplication of the platform OOM, but without the actual knowledge of the underlying memory constraints -- instead dying at an arbitrary value the user would need to fine-tune. I don't see this improving ease of use or security.

Even if we decide that a total cap is the way, I think a way simpler implementation should be possible. When allocating a group we can do the following:

  • Check if the allocator has the required memory throw if it doesn't. This should handle 99% of the cases without introduce any suppression mechanism.
  • As we process the individual allocations, escape at the first InvalidMemoryOperationException and dispose the already allocated buffers. Since this is the slow path anyways, the added allocation/deallocation ceremony should be ok.
  • Pull up the tracking logic to base classes.

I believe however, that it would be better to go with my original proposal from #2850. With that, allocating an outlier oversized GIF would fail with InvalidImageContentException would fail the same way (and for the same conditions) as allocating an outlier oversized JPG.

I can explore a minimalistic & heuristic solution for that. If it's not doable #2850 is probably not worth the trouble.

Thanks for the review @antonfirsov it's very much appreciated.

I understand the concern about duplicating OOM and introducing an arbitrary cap. The intent here isn't to second-guess the runtime, it's to provide an explicit circuit breaker at the only layer that has full visibility of actual allocated bytes. Something that triggers before we allow someone to bring down a server.

Decoder-level heuristics, whether frame count or pixel count, can't reliably model real memory usage because it depends on pixel format, intermediate buffers, processing, encoding, and concurrency. The allocator is the only place that sees the real byte pressure across all of that, which is why I chose to enforce the cap there rather than per decoder.

I agree that defaulting to a finite value can be seen as a breaking change. I'm absolutely fine with making the default unlimited so existing high-load workloads aren't impacted unless users opt in.

If there's a simpler way to structure the tracking within the allocator base types without the suppression mechanics, I'm open to that. The core goal isn't complexity, it's having a byte-accurate outstanding limit that protects long-running services from hitting OOM.

I don't see this as replacing the #2850 proposal. That addresses format-level validation symmetry. This addresses process-level memory safety across decode, encode, and processing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Apply AllocationLimitMegabytes against the total size of multi-frame images before attempting to decode them

2 participants