Add CUDA memory management APIs #1524
Conversation
Add the following torch.cuda APIs:

- empty_cache() - Release unoccupied cached memory (dotnet#1521)
- memory_allocated() - Current GPU memory occupied by tensors
- max_memory_allocated() - Peak GPU memory occupied by tensors
- reset_peak_memory_stats() - Reset peak memory tracking
- memory_reserved() - Current GPU memory managed by caching allocator
- max_memory_reserved() - Peak GPU memory managed by caching allocator
- mem_get_info() - Free and total memory on device
- set_device() - Set current CUDA device
- current_device() - Get current CUDA device index

These APIs are commonly used in PyTorch workflows for memory management and debugging, and are needed by TorchSharpExamples users.

Native implementations use c10::cuda::CUDACachingAllocator with #if defined(USE_CUDA) guards for CPU-only build compatibility.

Includes unit tests for all new APIs.
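For context, a minimal usage sketch of the surface this adds, assuming the C# members mirror the PyTorch names listed above and return byte counts and device indices as plain integers (the exact signatures are not spelled out in this description, so treat the details below as assumptions):

```csharp
using System;
using TorchSharp;
using static TorchSharp.torch;

// Illustrative only: assumes the new members live under torch.cuda and
// mirror PyTorch's semantics; exact return types are an assumption.
if (cuda.is_available())
{
    cuda.set_device(0);
    Console.WriteLine($"device: {cuda.current_device()}");

    var t = rand(1024, 1024, device: CUDA);                       // put something on the GPU
    Console.WriteLine($"allocated: {cuda.memory_allocated()}");   // bytes held by live tensors
    Console.WriteLine($"reserved:  {cuda.memory_reserved()}");    // bytes held by the caching allocator

    t.Dispose();
    cuda.empty_cache();              // hand unused cached blocks back to the driver
    cuda.reset_peak_memory_stats();  // restart peak tracking from the current state
}
```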
Pull request overview
This PR adds 9 new CUDA memory management APIs to TorchSharp, addressing issue #1521. These APIs provide essential functionality for monitoring and managing GPU memory usage, which is commonly needed in PyTorch workflows for debugging and optimization.
Changes:
- Adds CUDA memory management APIs: empty_cache, memory_allocated, max_memory_allocated, reset_peak_memory_stats, memory_reserved, max_memory_reserved, mem_get_info, set_device, and current_device
- Implements native C++ bindings with USE_CUDA guards for CPU-only build compatibility
- Includes comprehensive unit tests for all new APIs
- Bumps version from 0.106.0 to 0.106.1
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/TorchSharp/Torch.cs | Adds 9 new public CUDA memory management methods with XML documentation |
| src/TorchSharp/PInvoke/LibTorchSharp.THSTorchCuda.cs | Adds P/Invoke declarations for the new native methods |
| src/Native/LibTorchSharp/THSTorch.h | Declares native function signatures for CUDA memory APIs |
| src/Native/LibTorchSharp/THSTorch.cpp | Implements native functions using c10::cuda APIs with USE_CUDA guards and CPU-only stubs |
| test/TorchSharpTest/TestTorchSharp.cs | Adds comprehensive unit tests for all 9 new APIs |
| build/BranchInfo.props | Bumps patch version from 0.106.0 to 0.106.1 |
| RELEASENOTES.md | Documents the new APIs in release notes |
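For readers less familiar with the layering the table describes, a hypothetical sketch of the managed/native boundary is shown below. The entry-point names and signatures are illustrative assumptions, not the actual declarations in LibTorchSharp.THSTorchCuda.cs:

```csharp
using System.Runtime.InteropServices;

internal static class NativeMethodsSketch
{
    // Hypothetical entry points: the real names and signatures live in
    // src/TorchSharp/PInvoke/LibTorchSharp.THSTorchCuda.cs, backed by
    // src/Native/LibTorchSharp/THSTorch.cpp behind USE_CUDA guards.
    [DllImport("LibTorchSharp")]
    internal static extern void THSTorchCuda_empty_cache();

    [DllImport("LibTorchSharp")]
    internal static extern long THSTorchCuda_memory_allocated(long device);
}
```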
```csharp
/// so that those can be used in other GPU applications and visible in nvidia-smi.
/// </summary>
/// <remarks>
/// empty_cache() doesn't increase the amount of GPU memory available for PyTorch.
```
The documentation refers to "PyTorch" but this is TorchSharp. Update this to say "doesn't increase the amount of GPU memory available for TorchSharp" to accurately reflect the library being documented.
Suggested change:

```diff
- /// empty_cache() doesn't increase the amount of GPU memory available for PyTorch.
+ /// empty_cache() doesn't increase the amount of GPU memory available for TorchSharp.
```
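To make the distinction in that remark concrete, here is a short sketch of how the cached-versus-available difference typically shows up. This is illustrative usage of the new APIs, not code from this PR, and it assumes memory_reserved() reports a byte count:

```csharp
using System;
using TorchSharp;

// empty_cache() hands unused cached blocks back to the driver, which lowers
// memory_reserved() and the usage shown by nvidia-smi, but it does not raise
// the amount of memory TorchSharp can allocate for tensors afterwards.
var reservedBefore = torch.cuda.memory_reserved();

using (var t = torch.zeros(4096, 4096, device: torch.CUDA))
{
    // while the tensor is alive, memory_allocated() and memory_reserved() both grow
}

torch.cuda.empty_cache();
Console.WriteLine($"reserved before: {reservedBefore}, after: {torch.cuda.memory_reserved()}");
```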
```csharp
// Set to device 0 (always valid if CUDA is available)
cuda.set_device(0);
Assert.Equal(0, cuda.current_device());
```
Consider saving the original device at the beginning of the test and restoring it at the end to avoid potential side effects on other tests. While the tests are marked as Sequential and most explicitly set devices, it is good practice to restore the original state. You can use a try-finally block, or wrap the save-and-restore in a using-statement pattern to ensure cleanup (see the sketch after the suggestion below).
Suggested change:

```diff
- // Set to device 0 (always valid if CUDA is available)
- cuda.set_device(0);
- Assert.Equal(0, cuda.current_device());
+ // Set to device 0 (always valid if CUDA is available) and restore original device afterwards
+ try {
+     cuda.set_device(0);
+     Assert.Equal(0, cuda.current_device());
+ }
+ finally {
+     cuda.set_device(device);
+ }
```
Use c10::getDeviceAllocator (from c10/core/) instead of c10::cuda::CUDACachingAllocator (from c10/cuda/) in the non-CUDA build path. This allows memory_allocated, memory_reserved, max_memory_allocated, max_memory_reserved, empty_cache, reset_peak_memory_stats, and mem_get_info to return real values when CUDA is available at runtime, even when LibTorchSharp is compiled against the CPU-only libtorch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e context is set

Use non-throwing c10::cuda::GetDevice() instead of c10::cuda::current_device() to safely resolve the device index when -1 is passed (default/current device). Falls back to device 0 if no CUDA device context has been established yet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>